Ideas from quantum physics play important roles in many parts of modern
mathematics. Many parts of representation theory, for example, are motivated
by quantum mechanics, including the Wigner-Mackey theory of induced
representations, the Kirillov-Kostant orbit method, and, of course,
quantum  groups.  The  Jones  polynomial  in  knot  theory,  the  Gromov-Witten 
invariants  in  topology,  and  mirror  symmetry  in  algebraic  topology  are  other 
notable  examples.  The  awarding  of  the  1990  Fields  Medal  to  Ed  Witten,  a 
physicist,  gives  an  idea  of  the  scope  of  the  influence  of  quantum  theory  in 
mathematics. 

Despite  the  importance  of  quantum  mechanics  to  mathematics,  there  is 
no easy way for mathematicians to learn the subject. Quantum mechanics
books in the physics literature are generally not easily understood by
most  mathematicians.  There  is,  of  course,  a  lower  level  of  mathematical 
precision in such books than mathematicians are accustomed to. In addition,
physics books on quantum mechanics assume knowledge of classical
mechanics  that  mathematicians  often  do  not  have.  And,  finally,  there  is  a 
subtle  difference  in  “culture” — differences  in  terminology  and  notation — 
that  can  make  reading  the  physics  literature  like  reading  a  foreign  language 
for the mathematician. There are few books that attempt to translate quantum
theory into terms that mathematicians can understand.

This book is intended as an introduction to quantum mechanics for mathematicians
with little prior exposure to physics. The twin goals of the book
are (1) to explain the physical ideas of quantum mechanics in language
mathematicians will be comfortable with, and (2) to develop the necessary
mathematical tools to treat those ideas in a rigorous fashion. I have
attempted to give a reasonably comprehensive treatment of nonrelativistic
quantum  mechanics,  including  topics  found  in  typical  physics  texts  (e.g., 
the  harmonic  oscillator,  the  hydrogen  atom,  and  the  WKB  approximation) 
as well as more mathematical topics (e.g., quantization schemes, the Stone-von
Neumann theorem, and geometric quantization). I have also attempted
to minimize the mathematical prerequisites. I do not assume, for example,
any prior knowledge of spectral theory or unbounded operators, but provide
a full treatment of those topics in Chaps. 6 through 10 of the text.
Similarly,  I  do  not  assume  familiarity  with  the  theory  of  Lie  groups  and 
Lie  algebras,  but  provide  a  detailed  account  of  those  topics  in  Chap.  16. 
Whenever  possible,  I  provide  full  proofs  of  the  stated  results. 

Most  of  the  text  will  be  accessible  to  graduate  students  in  mathematics 
who have had a first course in real analysis, covering the basics of L² spaces
and Hilbert spaces. Appendix A reviews some of the results that are used in
the main body of the text. In Chaps. 21 and 23, however, I assume knowledge
of the theory of manifolds. I have attempted to provide motivation for
many of the definitions and proofs in the text, with the result that there
is a fair amount of discussion interspersed with the standard
definition-theorem-proof style of mathematical exposition. There are exercises at the
end  of  each  chapter,  making  the  book  suitable  for  graduate  courses  as  well 
as  for  independent  study. 

In  comparison  to  the  present  work,  classics  such  as  Reed  and  Simon  [34] 
and Glimm and Jaffe [14], along with the recent book of Schmüdgen [35],
are more focused on the mathematical underpinnings of the theory than
on the physical ideas. Hannabuss’s text [22] is fairly accessible to mathematicians,
but (despite the word “graduate” in the title of the series)
uses an undergraduate level of mathematics. The recent book of Takhtajan
[39],  meanwhile,  has  an  expository  bent  to  it,  but  provides  less  physical 
motivation  and  is  less  self-contained  than  the  present  book.  Whereas,  for 
example,  Takhtajan  begins  with  Lagrangian  and  Hamiltonian  mechanics 
on  manifolds,  I  begin  with  “low-tech”  classical  mechanics  on  the  real  line. 
Similarly,  Takhtajan  assumes  knowledge  of  unbounded  operators  and  Lie 
groups,  while  I  provide  substantial  expositions  of  both  of  those  subjects. 
Finally,  there  is  the  work  of  Folland  [13],  which  I  highly  recommend,  but 
which  deals  with  quantum  field  theory,  whereas  the  present  book  treats 
only  nonrelativistic  quantum  mechanics,  except  for  a  very  brief  discussion 
of  quantum  field  theory  in  Sect.  20.6. 

The  book  begins  with  a  quick  introduction  to  the  main  ideas  of  classical 
and  quantum  mechanics.  After  a  brief  account  in  Chap.  1  of  the  historical 
origins of quantum theory, I turn in Chap. 2 to a discussion of the necessary
background from classical mechanics. This includes Newton’s equation
in varying degrees of generality, along with a discussion of important
physical quantities such as energy, momentum, and angular momentum,
and conditions under which these quantities are “conserved” (i.e., constant
along each solution of Newton’s equation). I give a short treatment here
of Poisson brackets and Hamilton’s form of Newton’s equation, deferring a
full  discussion  of  “fancy”  classical  mechanics  to  Chap.  21. 

In Chap. 3, I attempt to motivate the structures of quantum mechanics in
the  simplest  setting.  Although  I  discuss  the  “axioms”  (in  standard  physics 
terminology)  of  quantum  mechanics,  I  resolutely  avoid  a  strictly  axiomatic 
approach  to  the  subject  (using,  say,  C*-algebras).  Rather,  I  try  to  provide 
some  motivation  for  the  position  and  momentum  operators  and  the  Hilbert 
space approach to quantum theory, as they connect to the probabilistic aspect
of the theory. I do not attempt to explain the strange probabilistic
nature of quantum theory, if, indeed, there is any explanation of it. Rather,
I try to elucidate how the wave function, along with the position and momentum
operators, encodes the relevant probabilities.

In Chaps. 4 and 5, we look into two illustrative cases of the Schrödinger
equation in one space dimension: a free particle and a particle in a square
well. In these chapters, we encounter such important concepts as the distinction
between phase velocity and group velocity and the distinction between
a discrete and a continuous spectrum.

In  Chaps.  6  through  10,  we  look  into  some  of  the  technical  mathematical 
issues  that  are  swept  under  the  carpet  in  earlier  chapters.  I  have  tried  to 
design  this  section  of  the  book  in  such  a  way  that  a  reader  can  take  in  as 
much  or  as  little  of  the  mathematical  details  as  desired.  For  a  reader  who 
simply wants the big picture, I outline the main ideas and results of spectral
theory in Chap. 6, including a discussion of the prototypical example
of an operator with a continuous spectrum: the momentum operator. For
a reader who wants more information, I provide statements of the spectral
theorem (in two different forms) for bounded self-adjoint operators in
Chap. 7, and an introduction to the notion of unbounded self-adjoint operators
in Chap. 9. Finally, for the reader who wants all the details, I give
proofs  of  the  spectral  theorem  for  bounded  and  unbounded  self-adjoint 
operators,  in  Chaps.  8  and  10,  respectively. 

In Chaps. 11 through 14, we turn to the vitally important canonical commutation
relations. These are used in Chap. 11 to derive algebraically the
spectrum of the quantum harmonic oscillator. In Chap. 12, we discuss the
uncertainty principle, both in its general form (for arbitrary pairs of noncommuting
operators) and in its specific form (for the position and momentum
operators). We pay careful attention to subtle domain issues that are
usually glossed over in the physics literature. In Chap. 13, we look at different
“quantization schemes” (i.e., different ways of ordering products of the
noncommuting  position  and  momentum  operators).  In  Chap.  14,  we  turn  to 
the  celebrated  Stone-von  Neumann  theorem,  which  provides  a  uniqueness 
result  for  representations  of  the  canonical  commutation  relations.  As  in  the 
case  of  the  uncertainty  principle,  there  are  some  subtle  domain  issues  here 
that  require  attention. 

In Chaps. 15 through 18, we examine some less elementary issues in quantum
theory. Chapter 15 addresses the WKB (Wentzel-Kramers-Brillouin)
approximation, which gives simple but approximate formulas for the eigenvectors
and eigenvalues for the Hamiltonian operator in one dimension.
After this, we introduce (Chap. 16) the notion of Lie groups, Lie algebras,
and their representations, all of which play an important role in
many parts of quantum mechanics. In Chap. 17, we consider the example
of angular momentum and spin, which can be understood in terms of the
representations of the rotation group SO(3). Here a more mathematical
approach — especially  the  relationship  between  Lie  group  representations 
and  Lie  algebra  representations — can  substantially  clarify  a  topic  that  is 
rather  mysterious  in  the  physics  literature.  In  particular,  the  concept  of 
“fractional  spin”  can  be  understood  as  describing  a  representation  of  the 
Lie algebra of the rotation group for which there is no associated representation
of the rotation group itself. In Chap. 18, we illustrate these ideas by
describing  the  energy  levels  of  the  hydrogen  atom,  including  a  discussion 
of  the  hidden  symmetries  of  hydrogen,  which  account  for  the  “accidental 
degeneracy”  in  the  levels.  In  Chap.  19,  we  look  more  closely  at  the  concept 
of  the  “state”  of  a  system  in  quantum  mechanics.  We  look  at  the  notion 
of  subsystems  of  a  quantum  system  in  terms  of  tensor  products  of  Hilbert 
spaces,  and  we  see  in  this  setting  that  the  notion  of  “pure  state”  (a  unit 
vector  in  the  relevant  Hilbert  space)  is  not  adequate.  We  are  led,  then,  to 
the  notion  of  a  mixed  state  (or  density  matrix).  We  also  examine  the  idea 
that,  in  quantum  mechanics,  “identical  particles  are  indistinguishable.” 

Finally, in Chaps. 20 through 23, we examine some advanced topics in
classical  and  quantum  mechanics.  We  begin,  in  Chap.  20,  by  considering  the 
path  integral  formulation  of  quantum  mechanics,  both  from  the  heuristic 
perspective  of  the  Feynman  path  integral,  and  from  the  rigorous  perspective 
of  the  Feynman-Kac  formula.  Then,  in  Chap.  21,  we  give  a  brief  treatment 
of  Hamiltonian  mechanics  on  manifolds.  Finally,  we  consider  the  machinery 
of  geometric  quantization,  beginning  with  the  Euclidean  case  in  Chap.  22 
and  continuing  with  the  general  case  in  Chap.  23. 

I  am  grateful  to  all  who  have  offered  suggestions  or  made  corrections 
to  the  manuscript,  including  Renato  Bettiol,  Edward  Burkard,  Matt  Cecil, 
Tiancong Chen, Bo Jacoby, Will Kirwin, Nicole Kroeger, Wicharn Lewkeeratiyutkul,
Jeff Mitchell, Eleanor Pettus, Ambar Sengupta, and Augusto
Stoffel. I am particularly grateful to Michel Talagrand, who read almost
the entire manuscript and made numerous corrections and suggestions. Finally,
I offer a special word of thanks to my advisor and friend, Leonard
Gross, who started me on the path toward understanding the mathematical
foundations of quantum mechanics. Readers are encouraged to send me
comments  or  corrections  at  bhall@nd.edu. 


Notre  Dame,  IN,  USA 


Brian  C.  Hall 
1

The  Experimental  Origins  of  Quantum 
Mechanics 


Quantum  mechanics,  with  its  controversial  probabilistic  nature  and  curious 
blending  of  waves  and  particles,  is  a  very  strange  theory.  It  was  not 
invented  because  anyone  thought  this  is  the  way  the  world  should  behave, 
but  because  various  experiments  showed  that  this  is  the  way  the  world 
does  behave,  like  it  or  not.  Craig  Hogan,  director  of  the  Fermilab  Particle 
Astrophysics  Center,  put  it  this  way: 

No theorist in his right mind would have invented quantum
mechanics unless forced to by data.¹

Although  the  first  hint  of  quantum  mechanics  came  in  1900  with  Planck’s 
solution  to  the  problem  of  blackbody  radiation,  the  full  theory  did  not 
emerge until 1925-1926, with Heisenberg’s matrix model, Schrödinger’s
wave  model,  and  Born’s  statistical  interpretation  of  the  wave  model. 


1.1  Is  Light  a  Wave  or  a  Particle? 

1.1.1  Newton  Versus  Huygens 

Beginning in the late seventeenth century and continuing into the early
eighteenth century, there was a vigorous debate in the scientific community
over the nature of light. One camp, following the views of Isaac
Newton, claimed that light consisted of a group of particles or “corpuscles.”
The other camp, led by the Dutch physicist Christiaan Huygens,
claimed that light was a wave. Newton argued that only a corpuscular theory
could account for the observed tendency of light to travel in straight
lines. Huygens and others, on the other hand, argued that a wave theory
could explain numerous observed aspects of light, including the bending
or “refraction” of light as it passes from one medium to another, as from
air into water. Newton’s reputation was such that his “corpuscular” theory
remained the dominant one until the early nineteenth century.

1 Quoted in “Is Space Digital?” by Michael Moyer, Scientific American, February
2012, pp. 30-36.

B.C. Hall, Quantum Theory for Mathematicians, Graduate Texts
in Mathematics 267, DOI 10.1007/978-1-4614-7116-5_1,
© Springer Science+Business Media New York 2013


1.1.2  The  Ascendance  of  the  Wave  Theory  of  Light 

In  1804,  Thomas  Young  published  two  papers  describing  and  explaining 
his  double-slit  experiment.  In  this  experiment,  sunlight  passes  through  a 
small  hole  in  a  piece  of  cardboard  and  strikes  another  piece  of  cardboard 
containing  two  small  holes.  The  light  then  strikes  a  third  piece  of  cardboard, 
where  the  pattern  of  light  may  be  observed.  Young  observed  “fringes”  or 
alternating  regions  of  high  and  low  intensity  for  the  light.  Young  believed 
that  light  was  a  wave  and  he  postulated  that  these  fringes  were  the  result 
of  interference  between  the  waves  emanating  from  the  two  holes.  Young 
drew  an  analogy  between  light  and  water,  where  in  the  case  of  water, 
interference  is  readily  observed.  If  two  circular  waves  of  water  cross  each 
other,  there  will  be  some  points  where  a  peak  of  one  wave  matches  up  with 
a trough of another wave, resulting in destructive interference, that is, a
partial cancellation between the two waves, resulting in a small amplitude
of the combined wave at that point. At other points, on the other hand, a
peak in one wave will line up with a peak in the other, or a trough with
a trough. At such points, there is constructive interference, with the result
that  the  amplitude  of  the  combined  wave  is  large  at  that  point.  The  pattern 
of  constructive  and  destructive  interference  will  produce  something  like  a 
checkerboard  pattern  of  alternating  regions  of  large  and  small  amplitudes 
in  the  combined  wave.  The  dimensions  of  each  region  will  be  roughly  on 
the  order  of  the  wavelength  of  the  individual  waves. 

Based  on  this  analogy  with  water  waves,  Young  was  able  to  explain  the 
interference  fringes  that  he  observed  and  to  predict  the  wavelength  that 
light  must  have  in  order  for  the  specific  patterns  he  observed  to  occur. 
Based  on  his  observations,  Young  claimed  that  the  wavelength  of  visible 
light  ranged  from  about  1/36,000  in.  (about  700  nm)  at  the  red  end  of  the 
spectrum  to  about  1/60,000  in.  (about  425  nm)  at  the  violet  end  of  the 
spectrum,  results  that  agree  with  modern  measurements. 
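
The unit conversion behind these figures is easy to check. The following is a minimal Python sketch, assuming only the exact definition 1 in. = 2.54 cm; the function name is illustrative and does not come from Young's papers.

```python
# 1 in = 2.54 cm exactly, so 1 in = 2.54e7 nm.
NM_PER_INCH = 2.54e7

def inch_fraction_to_nm(denominator: float) -> float:
    """Length of 1/denominator inches, expressed in nanometers."""
    return NM_PER_INCH / denominator

red_nm = inch_fraction_to_nm(36_000)     # red end of the spectrum
violet_nm = inch_fraction_to_nm(60_000)  # violet end of the spectrum
print(f"red: {red_nm:.0f} nm, violet: {violet_nm:.0f} nm")
```

Both values land within a few nanometers of the rounded figures quoted above.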

Figure 1.1 shows how circular waves emitted from two different points
form an interference pattern. One should think of Young’s second piece of
cardboard as being at the top of the figure, with holes near the top left and
top right of the figure. Figure 1.2 then plots the intensity (i.e., the square of
the displacement) as a function of x, with y having the value corresponding
to the bottom of Fig. 1.1.

FIGURE 1.1. Interference of waves emitted from two slits.
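
An intensity profile of this kind can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the two sources are taken to be in phase, the decay of amplitude with distance is ignored, and the intensity is taken to be the time-averaged squared magnitude of the combined wave. The source separation, wavelength, and screen distance are arbitrary illustrative values, not the ones used to draw the figures.

```python
import math

def intensity(x, y, separation=4.0, wavelength=1.0):
    """Time-averaged intensity at (x, y) from two in-phase point sources.

    The sources sit at (-separation/2, 0) and (+separation/2, 0).
    Amplitude decay with distance is ignored (a simplifying assumption).
    """
    k = 2 * math.pi / wavelength
    r1 = math.hypot(x + separation / 2, y)
    r2 = math.hypot(x - separation / 2, y)
    # |exp(i k r1) + exp(i k r2)|^2 = 2 + 2 cos(k (r1 - r2))
    return 2 + 2 * math.cos(k * (r1 - r2))

# Sample the intensity along a horizontal line far from the sources.
y_screen = 50.0
xs = [i * 0.5 for i in range(-40, 41)]
profile = [intensity(x, y_screen) for x in xs]
```

The profile is largest where the path difference r1 - r2 is a whole number of wavelengths (constructive interference) and close to zero where it is a half-integer number of wavelengths (destructive interference).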

Despite  the  convincing  nature  of  Young’s  experiment,  many  proponents 
of  the  corpuscular  theory  of  light  remained  unconvinced.  In  1818,  the 
French  Academy  of  Sciences  set  up  a  competition  for  papers  explaining 
the  observed  properties  of  light.  One  of  the  submissions  was  a  paper  by 
Augustin-Jean  Fresnel  in  which  he  elaborated  on  Huygens’s  wave  model 
of refraction. A supporter of the corpuscular theory of light, Siméon-Denis
Poisson, read Fresnel’s submission and ridiculed it by pointing out that
if that theory were true, light passing by an opaque disk would diffract
around the edges of the disk to produce a bright spot in the center of the
shadow of the disk, a prediction that Poisson considered absurd. Nevertheless,
the head of the judging committee for the competition, François
Arago,  decided  to  put  the  issue  to  an  experimental  test  and  found  that 
such  a  spot  does  in  fact  occur.  Although  this  spot  is  often  called  “Arago’s 
spot,”  or  even,  ironically,  “Poisson’s  spot,”  Arago  eventually  realized  that 
the  spot  had  been  observed  100  years  earlier  in  separate  experiments  by 
Delisle  and  Maraldi. 

Arago’s observation of Poisson’s spot led to widespread acceptance of
the wave theory of light. This theory gained even greater acceptance in
1865, when James Clerk Maxwell put together what are today known as
Maxwell’s equations. Maxwell showed that his equations predicted that
electromagnetic waves would propagate at a certain speed, which agreed
with the observed speed of light. Maxwell thus concluded that light is simply
an electromagnetic wave. From 1865 until the end of the nineteenth
century, the debate over the wave-versus-particle nature of light was considered
to have been conclusively settled in favor of the wave theory.

FIGURE 1.2. Intensity plot for a horizontal line across the bottom of Fig. 1.1

1.1.3  Blackbody  Radiation 

In  the  early  twentieth  century,  the  wave  theory  of  light  began  to  experience 
new challenges. The first challenge came from the theory of blackbody radiation.
In physics, a blackbody is an idealized object that perfectly absorbs all
electromagnetic radiation that hits it. A blackbody can be approximated in
the real world by an object with a highly absorbent surface such as “lamp
black.” The problem of blackbody radiation concerns the distribution of
electromagnetic radiation in a cavity within a blackbody. Although the
walls of the blackbody absorb the radiation that hits them, thermal vibrations
of the atoms making up the walls cause the blackbody to emit electromagnetic
radiation. (At normal temperatures, most of the radiation emitted
would  be  in  the  infrared  range.) 

In the cavity, then, electromagnetic radiation is constantly absorbed and
re-emitted until thermal equilibrium is reached, at which point the absorption
and emission of radiation are perfectly balanced at each frequency.
According to the “equipartition theorem” of (classical) statistical mechanics,
the energy in any given mode of electromagnetic radiation should be
exponentially distributed, with an average value equal to k_B T, where T is
the temperature and k_B is Boltzmann’s constant. (The temperature should
be measured on a scale where absolute zero corresponds to T = 0.) The difficulty
with this prediction is that the average amount of energy is the same
for every mode (hence the term “equipartition”). Thus, once one adds up
over all modes (of which there are infinitely many), the predicted amount
of energy in the cavity is infinite. This strange prediction is referred to as
the ultraviolet catastrophe, since the infinitude of the energy comes from the
ultraviolet (high-frequency) end of the spectrum. This ultraviolet catastrophe
does not seem to make physical sense and certainly does not match up
with the observed energy spectrum within real-world blackbodies.

An alternative prediction of the blackbody energy spectrum was offered
by Max Planck in a paper published in 1900. Planck postulated that
the energy in the electromagnetic field at a given frequency ω should be
“quantized,” meaning that this energy should come only in integer multiples
of a certain basic unit equal to ħω, where ħ is a constant, which
we now call Planck’s constant. Planck postulated that the energy would
again be exponentially distributed, but only over integer multiples of ħω.
At low frequencies, Planck’s theory predicts essentially the same energy as
in classical statistical mechanics. At high frequencies, namely at frequencies
where ħω is large compared to k_B T, Planck’s theory predicts a rapid
fall-off of the average energy (see Exercise 2 for details). Indeed, if we measure
mass, distance, and time in units of grams, centimeters, and seconds,
respectively, and we assign ħ the numerical value

ħ = 1.054 × 10⁻²⁷,

then Planck’s predictions match the experimentally observed blackbody
spectrum.
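
The low- and high-frequency behavior can be seen in a short numerical sketch. The Python code below assumes that "exponentially distributed over integer multiples of ħω" means that the energy nħω carries weight exp(-nħω/(k_B T)); summing the resulting geometric series gives an average energy of ħω/(exp(ħω/(k_B T)) - 1). The function names are illustrative, and the results are expressed through the dimensionless ratio x = ħω/(k_B T).

```python
import math

def planck_mean_energy_ratio(x: float) -> float:
    """Mean energy divided by k_B*T, where x = hbar*omega / (k_B*T).

    Closed form obtained by summing the geometric series:
    x / (exp(x) - 1).
    """
    return x / math.expm1(x)

def mean_energy_ratio_by_summation(x: float, terms: int = 10_000) -> float:
    """Same ratio, computed directly from the discrete distribution."""
    weights = [math.exp(-n * x) for n in range(terms)]
    return x * sum(n * w for n, w in enumerate(weights)) / sum(weights)

for x in (0.01, 1.0, 10.0):
    print(x, planck_mean_energy_ratio(x), mean_energy_ratio_by_summation(x))
```

For small x the ratio is close to 1, which recovers the classical value k_B T, while for large x it decays roughly like x exp(-x), which is the rapid fall-off that removes the ultraviolet catastrophe.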

Planck pictured the walls of the blackbody as being made up of independent
oscillators of different frequencies, each of which is restricted to
have energies of ħω. Although this picture was clearly not intended as a
realistic physical explanation of the quantization of electromagnetic energy
in blackbodies, it does suggest that Planck thought that energy quantization
arose from properties of the walls of the cavity, rather than from intrinsic
properties of the electromagnetic radiation. Einstein, on the other hand, in
assessing Planck’s model, argued that energy quantization was inherent in
the radiation itself. In Einstein’s picture, then, electromagnetic energy at
a given frequency (whether in a blackbody cavity or not) comes in packets
or quanta having energy proportional to the frequency. Each quantum
of electromagnetic energy constitutes what we now call a photon, which
we may think of as a particle of light. Thus, Planck’s model of blackbody
radiation  began  a  rebirth  of  the  particle  theory  of  light. 

It is worth mentioning, in passing, that in 1900, the same year in which
Planck’s paper on blackbody radiation appeared, Lord Kelvin gave a lecture
that drew attention to another difficulty with the classical theory
of statistical mechanics. Kelvin described two “clouds” over nineteenth-century
physics at the dawn of the twentieth century. The first of these
clouds concerned aether (a hypothetical medium through which electromagnetic
radiation propagates) and the failure of Michelson and Morley to
observe the motion of the earth relative to the aether. Under this cloud lurked
the theory of special relativity. The second of Kelvin’s clouds concerned
heat capacities in gases. The equipartition theorem of classical statistical
mechanics made predictions for the ratio of the heat capacity at constant
pressure (c_p) to the heat capacity at constant volume (c_v). These predictions
deviated substantially from the experimentally measured ratios.
Under the second cloud lurked the theory of quantum mechanics, because
the resolution of this discrepancy is similar to Planck’s resolution of the
blackbody problem. As in the case of blackbody radiation, quantum mechanics
gives rise to a correction to the equipartition theorem, thus resulting
in different predictions for the ratio of c_p to c_v, predictions that can be
reconciled with the observed ratios.

1.1.4  The  Photoelectric  Effect 

The  year  1905  was  Einstein’s  annus  mirabilis  (miraculous  year),  in  which 
Einstein  published  four  ground-breaking  papers,  two  on  the  special  theory 
of  relativity  and  one  each  on  Brownian  motion  and  the  photoelectric  effect. 
It  was  for  the  photoelectric  effect  that  Einstein  won  the  Nobel  Prize  in 
physics in 1921. In the photoelectric effect, electromagnetic radiation striking
a metal causes electrons to be emitted from the metal. Einstein found
that as one increases the intensity of the incident light, the number of emitted
electrons increases, but the energy of each electron does not change.
This  result  is  difficult  to  explain  from  the  perspective  of  the  wave  theory  of 
light.  After  all,  if  light  is  simply  an  electromagnetic  wave,  then  increasing 
the  intensity  of  the  light  amounts  to  increasing  the  strength  of  the  electric 
and  magnetic  fields  involved.  Increasing  the  strength  of  the  fields,  in  turn, 
ought  to  increase  the  amount  of  energy  transferred  to  the  electrons. 

Einstein’s  results,  on  the  other  hand,  are  readily  explained  from  a  particle 
theory  of  light.  Suppose  light  is  actually  a  stream  of  particles  (photons)  with 
the  energy  of  each  particle  determined  by  its  frequency.  Then  increasing 
the  intensity  of  light  at  a  given  frequency  simply  increases  the  number  of 
photons  and  does  not  affect  the  energy  of  each  photon.  If  each  photon  has 
a  certain  likelihood  of  hitting  an  electron  and  causing  it  to  escape  from 
the  metal,  then  the  energy  of  the  escaping  electron  will  be  determined 
by  the  frequency  of  the  incident  light  and  not  by  the  intensity  of  that 
light.  The  photoelectric  effect,  then,  provided  another  compelling  reason 
for  believing  that  light  can  behave  in  a  particlelike  manner. 
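
The bookkeeping in this argument can be made concrete with a small Python sketch. It assumes that each photon carries energy ħω, using the value of ħ quoted earlier in grams, centimeters, and seconds; the function names and the sample frequency and beam powers are hypothetical choices for illustration.

```python
HBAR = 1.054e-27  # erg*s, the numerical value quoted earlier in the text

def photon_energy(omega: float) -> float:
    """Energy of one photon (erg) at angular frequency omega (rad/s)."""
    return HBAR * omega

def photons_per_second(power: float, omega: float) -> float:
    """Photon arrival rate for a beam delivering `power` erg/s."""
    return power / photon_energy(omega)

omega = 3.0e15  # rad/s, an illustrative frequency
energy = photon_energy(omega)
rate_dim = photons_per_second(1.0, omega)
rate_bright = photons_per_second(2.0, omega)
print(energy, rate_dim, rate_bright)
```

Doubling the beam power at fixed frequency doubles the number of photons arriving per second and leaves the energy per photon unchanged, which is the pattern observed for the emitted electrons.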

1.1.5 The Double-Slit Experiment, Revisited

Although the work of Planck and Einstein suggests that there is a particlelike
aspect to light, there is certainly also a wavelike aspect to light,
as  shown  by  Young,  Arago,  and  Maxwell,  among  others.  Thus,  somehow, 
light  must  in  some  situations  behave  like  a  wave  and  in  some  situations 
like  a  particle,  a  phenomenon  known  as  “wave-particle  duality.”  William 
Lawrence  Bragg  described  the  situation  thus: 

God  runs  electromagnetics  on  Monday,  Wednesday,  and  Friday 
by  the  wave  theory,  and  the  devil  runs  them  by  quantum  theory 
on  Tuesday,  Thursday,  and  Saturday. 

(Apparently  Sunday,  being  a  day  of  rest,  did  not  need  to  be  accounted  for.) 


In  particular,  we  have  already  seen  that  Young’s  double-slit  experiment 
in the early nineteenth century was one important piece of evidence in favor
of the wave theory of light. If light is really made up of particles, as
blackbody radiation and the photoelectric effect suggest, one must give a
particle-based explanation of the double-slit experiment. J. J. Thomson suggested
in 1907 that the patterns of light seen in the double-slit experiment
could be the result of different photons somehow interfering with one another.
Thomson thus suggested that if the intensity of light were sufficiently
reduced,  the  photons  in  the  light  would  become  widely  separated  and  the 
interference  pattern  might  disappear.  In  1909,  Geoffrey  Ingram  Taylor  set 
out  to  test  this  suggestion  and  found  that  even  when  the  intensity  of  light 
was  drastically  reduced  (to  the  point  that  it  took  three  months  for  one  of 
the  images  to  form),  the  interference  pattern  remained  the  same. 

Since  Taylor’s  results  suggest  that  interference  remains  even  when  the 
photons are widely separated, the photons are not interfering with one another.
Rather, as Paul Dirac put it in Chap. 1 of [6], “Each photon then
interferes only with itself.” To state this in a different way, since there is no
interference when there is only one slit, Taylor’s results suggest that each
individual photon passes through both slits. By the early 1960s, it became
possible to perform double-slit experiments with electrons instead of photons,
yielding even more dramatic confirmations of the strange behavior of
matter  in  the  quantum  realm.  (See  Sect.  1.2.4.) 


1.2  Is  an  Electron  a  Wave  or  a  Particle? 

In  the  early  part  of  the  twentieth  century,  the  atomic  theory  of  matter 
became  firmly  established.  (Einstein’s  1905  paper  on  Brownian  motion  was 
an  important  confirmation  of  the  theory  and  provided  the  first  calculation 
of  atomic  masses  in  everyday  units.)  Experiments  performed  in  1909  by 
Hans  Geiger  and  Ernest  Marsden,  under  the  direction  of  Ernest  Rutherford, 
led  Rutherford  to  put  forward  in  1911  a  picture  of  atoms  in  which  a  small 
nucleus  contains  most  of  the  mass  of  the  atom.  In  Rutherford’s  model, 
each  atom  has  a  positively  charged  nucleus  with  charge  nq,  where  n  is 
a positive integer (the atomic number) and $q$ is the basic unit of charge
first observed in Millikan’s famous oil-drop experiment. Surrounding the
nucleus is a cloud of $n$ electrons, each having negative charge $-q$. When
atoms  bind  into  molecules,  some  of  the  electrons  of  one  atom  may  be  shared 
with  another  atom  to  form  a  bond  between  the  atoms.  This  picture  of  atoms 
and  their  binding  led  to  the  modern  theory  of  chemistry. 

Basic  to  the  atomic  theory  is  that  electrons  are  particles;  indeed,  the 
number of electrons per atom is supposed to be the atomic number. Nevertheless,
it did not take long after the atomic theory of matter was confirmed
before wavelike properties of electrons began to be observed. The situation,
then, is the reverse of that with light. While light was long thought to be
a  wave  (at  least  from  the  publication  of  Maxwell’s  equations  in  1865  until 
Planck’s  work  in  1900)  and  was  only  later  seen  to  have  particlelike  behavior, 
electrons  were  initially  thought  to  be  particles  and  were  only  later  seen  to 
have  wavelike  properties.  In  the  end,  however,  both  light  and  electrons  have 
both  wavelike  and  particlelike  properties. 

1.2.1  The  Spectrum  of  Hydrogen 

If  electricity  is  passed  through  a  tube  containing  hydrogen  gas,  the  gas  will 
emit  light.  If  that  light  is  separated  into  different  frequencies  by  means 
of  a  prism,  bands  will  become  apparent,  indicating  that  the  light  is  not  a 
continuous  mix  of  many  different  frequencies,  but  rather  consists  only  of  a 
discrete  family  of  frequencies.  In  view  of  the  photonic  theory  of  light,  the 
energy  in  each  photon  is  proportional  to  its  frequency.  Thus,  each  observed 
frequency  corresponds  to  a  certain  amount  of  energy  being  transferred  from 
a  hydrogen  atom  to  the  electromagnetic  field. 

Now,  a  hydrogen  atom  consists  of  a  single  proton  surrounded  by  a  single 
electron.  Since  the  proton  is  much  more  massive  than  the  electron,  one 
can  picture  the  proton  as  being  stationary,  with  the  electron  orbiting  it. 
The  idea,  then,  is  that  the  current  being  passed  through  the  gas  causes  some 
of  the  electrons  to  move  to  a  higher-energy  state.  Eventually,  that  electron 
will  return  to  a  lower-energy  state,  emitting  a  photon  in  the  process.  In  this 
way,  by  observing  the  energies  (or,  equivalently,  the  frequencies)  of  the 
emitted  photons,  one  can  work  backwards  to  the  change  in  energy  of  the 
electron. 

The  curious  thing  about  the  state  of  affairs  in  the  preceding  paragraph 
is  that  the  energies  of  the  emitted  photons — and  hence,  also,  the  energies 
of  the  electron — come  only  in  a  discrete  family  of  possible  values.  Based 
on  the  observed  frequencies,  Johannes  Rydberg  concluded  in  1888  that  the 
possible  energies  of  the  electron  were  of  the  form 

$$E_n = -\frac{R}{n^2}. \tag{1.1}$$

Here, $R$ is the “Rydberg constant,” given (in “Gaussian units”) by

$$R = \frac{m_e Q^4}{2\hbar^2},$$

where $Q$ is the charge of the electron and $m_e$ is the mass of the electron.
(Technically, $m_e$ should be replaced by the reduced mass $\mu$ of the proton-electron
system; that is, $\mu = m_e m_p/(m_e + m_p)$, where $m_p$ is the mass
of the proton. However, since the proton mass is much greater than the
electron mass, $\mu$ is almost the same as $m_e$ and we will neglect the difference
between the two.) The energies in (1.1) agree with experiment, in that all
the observed frequencies in hydrogen are (at least to the precision available
at the time of Rydberg) of the form

$$\omega = \frac{1}{\hbar}(E_n - E_m), \tag{1.2}$$

for some $n > m$. It should be noted that Johann Balmer had already
observed in 1885 frequencies of the same form, but only in the case $m = 2$,
and that Balmer’s work influenced Rydberg.

The  frequencies  in  (1.2)  are  known  as  the  spectrum  of  hydrogen.  Balmer 
and  Rydberg  were  merely  attempting  to  find  a  simple  formula  that  would 
match the observed frequencies in hydrogen. Neither of them had a theoretical
explanation for why only these particular frequencies occur. Such
an  explanation  would  have  to  wait  until  the  beginnings  of  quantum  theory 
in  the  twentieth  century. 
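As a quick numerical illustration of (1.1) and (1.2), the following Python sketch evaluates the Rydberg constant and the first few frequencies of the Balmer series, converted to wavelengths. The numerical constants are standard values in Gaussian (CGS) units that we supply as assumptions, and the function names are illustrative only.

```python
import math

# Physical constants in Gaussian (CGS) units; values assumed for illustration.
M_E = 9.1093837e-28        # electron mass, g
Q = 4.80320471e-10         # electron charge, esu
HBAR = 1.054571817e-27     # Planck's constant, erg*s
C = 2.99792458e10          # speed of light, cm/s
ERG_PER_EV = 1.602176634e-12

R = M_E * Q**4 / (2 * HBAR**2)   # Rydberg constant (an energy), in erg


def rydberg_energy(n):
    """Energy E_n = -R / n^2 from (1.1), in erg."""
    return -R / n**2


def transition_frequency(n, m):
    """Angular frequency (E_n - E_m) / hbar from (1.2), for n > m."""
    return (rydberg_energy(n) - rydberg_energy(m)) / HBAR


def transition_wavelength_nm(n, m):
    """Vacuum wavelength 2*pi*c / omega of the emitted light, in nm."""
    return 2 * math.pi * C / transition_frequency(n, m) * 1e7


print(f"R = {R / ERG_PER_EV:.3f} eV")
for n in range(3, 7):                 # Balmer's case: m = 2
    print(n, f"{transition_wavelength_nm(n, 2):.1f} nm")
```

The wavelengths for $m = 2$ fall in or near the visible range, which is consistent with Balmer having found this family of lines first. The sketch uses $m_e$ rather than the reduced mass, as in the text.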

1.2.2  The  Bohr-de  Broglie  Model  of  the  Hydrogen  Atom 

In 1913, Niels Bohr introduced a model of the hydrogen atom that attempted
to explain the observed spectrum of hydrogen. Bohr pictured the
hydrogen  atom  as  consisting  of  an  electron  orbiting  a  positively  charged 
nucleus,  in  much  the  same  way  that  a  planet  orbits  the  sun.  Classically, 
the  force  exerted  on  the  electron  by  the  proton  follows  the  inverse  square 
law  of  the  form 

$$F = \frac{Q^2}{r^2}, \tag{1.3}$$

where  Q  is  the  charge  of  the  electron,  in  appropriate  units. 

If  the  electron  is  in  a  circular  orbit,  its  trajectory  in  the  plane  of  the 
orbit  will  take  the  form 

$$(x(t), y(t)) = (r\cos(\omega t),\ r\sin(\omega t)).$$

If we take the second derivative with respect to time to obtain the acceleration
vector $\mathbf{a}$, we obtain

$$\mathbf{a}(t) = (-\omega^2 r\cos(\omega t),\ -\omega^2 r\sin(\omega t)),$$

so that the magnitude of the acceleration vector is $\omega^2 r$. Newton’s second
law, $F = ma$, then requires that

$$m_e\omega^2 r = \frac{Q^2}{r^2},$$

so that

$$\omega = \sqrt{\frac{Q^2}{m_e r^3}}.$$

From the formula for the frequency, we can calculate that the momentum
(mass times velocity) has magnitude

$$p = \sqrt{\frac{m_e Q^2}{r}}. \tag{1.4}$$

We can also calculate the angular momentum $J$, which for a circular orbit
is just the momentum times the distance from the nucleus, as

$$J = \sqrt{m_e Q^2 r}.$$


Bohr postulated that the electron obeys classical mechanics, except that
its angular momentum is “quantized.” Specifically, in Bohr’s model, the
angular momentum is required to be an integer multiple of $\hbar$ (Planck’s
constant). Setting $J$ equal to $n\hbar$ yields

$$r_n = \frac{n^2\hbar^2}{m_e Q^2}. \tag{1.5}$$

If one calculates the energy of an orbit with radius $r_n$, one finds (Exercise 3)
that  it  agrees  precisely  with  the  Rydberg  energies  in  (1.1).  Bohr  further 
postulated  that  an  electron  could  move  from  one  allowed  state  to  another, 
emitting  a  packet  of  light  in  the  process  with  frequency  given  by  (1.2). 
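The quantization condition can be checked numerically. The sketch below, under the assumption of standard CGS values for the constants, computes the radii $r_n$, confirms that the angular momentum of each orbit is $n\hbar$, and compares the orbit energy with $-R/n^2$. It is a numerical sanity check rather than a substitute for Exercise 3, and the function names are ours.

```python
import math

# Gaussian (CGS) constants; values assumed for illustration.
M_E = 9.1093837e-28        # electron mass, g
Q = 4.80320471e-10         # electron charge, esu
HBAR = 1.054571817e-27     # Planck's constant, erg*s

R = M_E * Q**4 / (2 * HBAR**2)   # Rydberg constant, erg


def bohr_radius(n):
    """Radius r_n = n^2 hbar^2 / (m_e Q^2) of the n-th allowed orbit, in cm."""
    return n**2 * HBAR**2 / (M_E * Q**2)


def momentum(r):
    """Momentum p = sqrt(m_e Q^2 / r) on a circular orbit of radius r."""
    return math.sqrt(M_E * Q**2 / r)


def orbit_energy(r):
    """Kinetic plus potential energy, p^2 / (2 m_e) - Q^2 / r."""
    return momentum(r)**2 / (2 * M_E) - Q**2 / r


for n in range(1, 4):
    r = bohr_radius(n)
    J = momentum(r) * r               # angular momentum of the orbit
    # Columns: n, radius in cm, J / hbar, ratio of orbit energy to -R / n^2.
    print(n, r, J / HBAR, orbit_energy(r) / (-R / n**2))
```

The last two columns should come out as $n$ and 1, up to rounding.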

Bohr did not explain why the angular momentum of an electron is quantized,
nor how it moved from one allowed orbit to another. As such, his
theory of atomic behavior was clearly not complete; it belongs to the “old
quantum mechanics” that was superseded by the matrix model of Heisenberg
and the wave model of Schrödinger. Nevertheless, Bohr’s model was an
important step in the process of understanding the behavior of atoms, and
Bohr was awarded the 1922 Nobel Prize in physics for his work. Some remnant
of Bohr’s approach survives in modern quantum theory, in the WKB
approximation  (Chap.  15),  where  the  Bohr-Sommerfeld  condition  gives  an 
approximation  to  the  energy  levels  of  a  one-dimensional  quantum  system. 

In  1924,  Louis  de  Broglie  reinterpreted  Bohr’s  condition  on  the  angular 
momentum as a wave condition. The de Broglie hypothesis is that an electron
can be described by a wave, where the spatial frequency $k$ of the wave
is related to the momentum of the electron by the relation

$$p = \hbar k. \tag{1.6}$$

Here, “frequency” is defined so that the frequency of the function $\cos(kx)$
is $k$. This is “angular” frequency, which differs by a factor of $2\pi$ from the
cycles-per-unit-distance frequency. Thus, the period associated with a given
frequency $k$ is $2\pi/k$.

In de Broglie’s approach, we are supposed to imagine a wave superimposed
on the classical trajectory of the electron, with the quantization
condition now being that the wave should match up with itself when going
all the way around the orbit. This condition means that the orbit should
consist of an integer number of periods of the wave:

$$2\pi r = n\,\frac{2\pi}{k}.$$

Using (1.6) along with the expression (1.4) for $p$, we obtain

$$2\pi r = n\,\frac{2\pi\hbar}{p} = 2\pi n\hbar\sqrt{\frac{r}{m_e Q^2}}.$$

Solving this equation for $r$ gives precisely the Bohr radii in (1.5).

FIGURE 1.3. The Bohr radii for $n = 1$ to $n = 10$, with de Broglie waves superimposed
for $n = 8$ and $n = 10$.

Thus, de Broglie’s wave hypothesis gives an alternative to Bohr’s quantization
of angular momentum as an explanation of the allowed energies of
hydrogen. Of course, if one accepts de Broglie’s wave hypothesis for electrons,
one would expect to see wavelike behavior of electrons not just in the
hydrogen  atom,  but  in  other  situations  as  well,  an  expectation  that  would 
soon  be  fulfilled.  Figure  1.3  shows  the  first  10  Bohr  radii.  For  the  8th  and 
10th  radii,  the  de  Broglie  wave  is  shown  superimposed  onto  the  orbit. 
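To see the wave condition at work, one can count how many periods of the de Broglie wave fit around a circular orbit. The following sketch does this for a few Bohr radii and for one radius that is not allowed. The constants are assumed CGS values and `periods_around_orbit` is a name introduced here for illustration.

```python
import math

M_E = 9.1093837e-28        # electron mass, g (assumed CGS value)
Q = 4.80320471e-10         # electron charge, esu (assumed CGS value)
HBAR = 1.054571817e-27     # Planck's constant, erg*s (assumed CGS value)


def periods_around_orbit(r):
    """Number of de Broglie periods that fit around a circular orbit of radius r."""
    p = math.sqrt(M_E * Q**2 / r)     # momentum on the orbit
    k = p / HBAR                      # spatial frequency from p = hbar * k
    period = 2 * math.pi / k          # spatial period of cos(k x)
    return 2 * math.pi * r / period


r1 = HBAR**2 / (M_E * Q**2)           # first Bohr radius

for n in (1, 2, 8, 10):
    print(n, periods_around_orbit(n**2 * r1))    # n periods, up to rounding

print(periods_around_orbit(2 * r1))   # not an integer: the wave fails to close up
```

For a general radius $r$ the count is $\sqrt{r/r_1}$, so the wave matches up with itself only at the radii $n^2 r_1$.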

1.2.3  Electron  Diffraction 

In  1925,  Clinton  Davisson  and  Lester  Germer  were  studying  properties  of 
nickel  by  bombarding  a  thin  film  of  nickel  with  low-energy  electrons.  As  a 
result  of  a  problem  with  their  equipment,  the  nickel  was  accidentally  heated 
to  a  very  high  temperature.  When  the  nickel  cooled,  it  formed  into  large 
crystalline pieces, rather than the small crystals in the original sample.
After  this  recrystallization,  Davisson  and  Germer  observed  peaks  in  the 
pattern  of  electrons  reflecting  off  of  the  nickel  sample  that  had  not  been 
present  when  using  the  original  sample.  They  were  at  a  loss  to  explain  this 
pattern  until,  in  1926,  Davisson  learned  of  the  de  Broglie  hypothesis  and 
suspected  that  they  were  observing  the  wavelike  behavior  of  electrons  that 
de  Broglie  had  predicted. 

After this realization, Davisson and Germer began to look systematically
for wavelike peaks in their experiments. Specifically, they attempted
to show that the pattern of angles at which the electrons reflected matched
the patterns one sees in x-ray diffraction. After numerous additional measurements,
they were able to show a very close correspondence between
the  pattern  of  electrons  and  the  patterns  seen  in  x-ray  diffraction.  Since 
x-rays  were  by  this  time  known  to  be  waves  of  electromagnetic  radiation, 
the  Davisson-Germer  experiment  was  a  strong  confirmation  of  de  Broglie’s 
wave  picture  of  electrons.  Davisson  and  Germer  published  their  results  in 
two  papers  in  1927,  and  Davisson  shared  the  1937  Nobel  Prize  in  physics 
with George Paget Thomson, who had observed electron diffraction shortly after
Davisson  and  Germer. 

1.2.4  The  Double-Slit  Experiment  with  Electrons

Although  quantum  theory  clearly  predicts  that  electrons  passing  through 
a  double  slit  will  experience  interference  similar  to  that  observed  in  light, 
it was not until Claus Jönsson’s work in 1961 that this prediction was
confirmed experimentally. The main difficulty is the much smaller wavelength
for electrons of reasonable energy than for visible light. Jönsson’s
electrons, for example, had a de Broglie wavelength of 5 nm, as compared to
a  wavelength  of  roughly  500 nm  for  visible  light  (depending  on  the  color). 

In  results  published  in  1989,  a  team  led  by  Akira  Tonomura  at  Hitachi 
performed  a  double-slit  experiment  in  which  they  were  able  to  record  the 
results  one  electron  at  a  time.  (Similar  but  less  definitive  experiments  were 
carried  out  by  Pier  Giorgio  Merli,  GianFranco  Missiroli  and  Giulio  Pozzi 
in  Bologna  in  1974  and  published  in  the  American  Journal  of  Physics  in 
1976.)  In  the  Hitachi  experiment,  each  electron  passes  through  the  slits  and 
then  strikes  a  screen,  causing  a  small  spot  of  light  to  appear.  The  location  of 
this  spot  is  then  recorded  for  each  electron,  one  at  a  time.  The  key  point  is 
that  each  individual  electron  strikes  the  screen  at  a  single  point.  That  is  to 
say,  individual  electrons  are  not  smeared  out  across  the  screen  in  a  wavelike 
pattern,  but  rather  behave  like  point  particles,  in  that  the  observed  location 
of  the  electron  is  indeed  a  point.  Each  electron,  however,  strikes  the  screen 
at  a  different  point,  and  once  a  large  number  of  the  electrons  have  struck 
and  their  locations  have  been  recorded,  an  interference  pattern  emerges. 

It is not the variability of the locations of the electrons that is surprising,
since this could be accounted for by small variations in the way the electrons
are shot toward the slits. Rather, it is the distinctive interference pattern
that is surprising, with rapid variations in the pattern of electron strikes
over short distances, including regions where almost no electron strikes
occur. (Compare Fig. 1.4 to Fig. 1.2.) Note also that in the experiment, the
electrons are widely separated, so that there is never more than one electron
in the apparatus at any one time. Thus, the electrons cannot interfere with
one another; rather, each electron interferes with itself. Figure 1.4 shows
results from the Hitachi experiment, with the number of observed electrons
increasing from about 150 in the first image to 160,000 in the last image.

FIGURE 1.4. Four images from the 1989 experiment at Hitachi showing the
impact of individual electrons gradually building up to form an interference pattern.
Image by Akira Tonomura and Wikimedia Commons user Belsazar. File
is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported
license.


1.3  Schrödinger  and  Heisenberg

In 1925, Werner Heisenberg proposed a model of quantum mechanics based
on treating the position and momentum of the particle as, essentially,
matrices of size $\infty \times \infty$. Actually, Heisenberg himself was not familiar with
the theory of matrices, which was not a standard part of the mathematical
education of physicists at the time. Nevertheless, he had quantities of the
form $x_{jk}$ and $p_{jk}$ (where $j$ and $k$ each vary over all integers), which we
can recognize as matrices, as well as expressions such as $\sum_l x_{jl} p_{lk}$, which
we can recognize as a matrix product. After Heisenberg explained his theory
to Max Born, Born recognized the connection of Heisenberg’s formulas
to matrix theory and made the matrix point of view explicit, in a paper
coauthored by Born and his assistant, Pascual Jordan. Born, Heisenberg,
and Jordan then all published a paper together elaborating upon their theory.
The papers of Heisenberg, of Born and Jordan, and of Born, Heisenberg,
and Jordan all appeared in 1925. Heisenberg received the 1932 Nobel
Prize in physics (actually awarded in 1933) for his work. Born’s exclusion
from this prize was controversial, and may have been influenced by Jordan’s
connections with the Nazi party in Germany. (Heisenberg’s own work for
the Nazis during World War II was also a source of much controversy after
the war.) In any case, Born was awarded the Nobel Prize in physics in
1954 for his work on the statistical interpretation of quantum mechanics
(Sect. 1.4).

Meanwhile, in 1926, Erwin Schrödinger published four remarkable papers
in which he proposed a wave theory of quantum mechanics, along the lines
of the de Broglie hypothesis. In these papers, Schrödinger described how the
waves evolve over time and showed that the energy levels of, for example,
the hydrogen atom could be understood as eigenvalues of a certain operator.
(See Chap. 18 for the computation for hydrogen.) Schrödinger also
showed that the Heisenberg-Born-Jordan matrix model could be incorporated
into the wave theory, thus showing that the matrix theory and the
wave theory were equivalent (see Sect. 3.8). This book describes the mathematical
structure of quantum mechanics in essentially the form proposed
by Schrödinger in 1926. Schrödinger shared the 1933 Nobel Prize in physics
with Paul Dirac.


1.4  A  Matter  of  Interpretation 


Although Schrödinger’s 1926 papers gave the correct mathematical description
of quantum mechanics (as it is generally accepted today), he did not
provide a widely accepted interpretation of the theory. That task fell to
Born, who in a 1926 paper proposed that the “wave function” (as the wave
appearing in the Schrödinger equation is generally called) should be interpreted
statistically, that is, as determining the probabilities for observations
of the system. Over time, Born’s statistical approach developed into the
Copenhagen interpretation of quantum mechanics. Under this interpretation,
the wave function $\psi$ of the system is not directly observable. Rather,
$\psi$ merely determines the probability of observing a particular result.


In particular, if $\psi$ is properly normalized, then the quantity $|\psi(x)|^2$ is
the probability distribution for the position of the particle. Even if $\psi$ itself
is spread out over a large region in space, any measurement of the position
of the particle will show that the particle is located at a single point, just
as we see for the electrons in the two-slit experiment in Fig. 1.4. Thus, a
measurement of a particle’s position does not show the particle “smeared
out” over a large region of space, even if the wave function $\psi$ is smeared
out over a large region.

Consider, for example, how Born’s interpretation of the Schrödinger
equation would play out in the context of the Hitachi double-slit experiment
depicted in Fig. 1.4. Born would say that each electron has a wave
function that evolves in time according to the Schrödinger equation (an
equation of wave type). Each particle’s wave function, then, will propagate
through the slits in a manner similar to that pictured in Fig. 1.1. If
there  is  a  screen  at  the  bottom  of  Fig.  1.1,  then  the  electron  will  hit  the 
screen  at  a  single  point,  even  though  the  wave  function  is  very  spread  out. 
The  wave  function  does  not  determine  where  the  particle  hits  the  screen;  it 
merely  determines  the  probabilities  for  where  the  particle  hits  the  screen.  If 
a  whole  sequence  of  electrons  passes  through  the  slits,  one  after  the  other, 
over  time  a  probability  distribution  will  emerge,  determined  by  the  square 
of  the  magnitude  of  the  wave  function,  which  is  shown  in  Fig.  1.2.  Thus, 
the  probability  distribution  of  electrons,  as  seen  from  a  large  number  of 
electrons  as  in  Fig.  1.4,  shows  wavelike  interference  patterns,  even  though 
each  individual  electron  strikes  the  screen  at  a  single  point. 
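The way a pattern builds up from individual point-like impacts can be imitated with a small simulation. The sketch below draws impact positions one at a time from a probability density with fringes and then bins them. The density used is a toy stand-in chosen for illustration (it is not computed from the Schrödinger equation), and all names are ours.

```python
import math
import random


def density(x):
    """Toy unnormalized |psi(x)|^2 on the screen: fringes under a smooth envelope."""
    return math.cos(2 * math.pi * x) ** 2 * math.exp(-x * x)


def sample_hit(rng):
    """Draw one impact position in [-1, 1] by rejection sampling."""
    while True:
        x = rng.uniform(-1.0, 1.0)
        if rng.random() <= density(x):      # valid because density(x) <= 1
            return x


def histogram(hits, bins=40):
    """Count the hits falling in equal-width bins covering [-1, 1]."""
    counts = [0] * bins
    for x in hits:
        counts[min(int((x + 1.0) / 2.0 * bins), bins - 1)] += 1
    return counts


rng = random.Random(0)
hits = [sample_hit(rng) for _ in range(20000)]   # each hit is a single point
for count in histogram(hits):
    print("#" * (count // 40))
```

Each sample is a single position, yet the histogram of many samples shows bright and dark bands, in the same spirit as the successive images in Fig. 1.4.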

It is essential to the theory that the wave function $\psi(x)$ itself is not the
probability density for the location of the particle. Rather, the probability
density is $|\psi(x)|^2$. The difference is crucial, because probability densities
are  intrinsically  positive  and  thus  do  not  exhibit  destructive  interference. 
The  wave  function  itself,  however,  is  complex-valued,  and  the  real  and 
imaginary  parts  of  the  wave  function  take  on  both  positive  and  negative 
values,  which  can  interfere  constructively  or  destructively.  The  part  of  the 
wave  function  passing  through  the  first  slit,  for  example,  can  interfere  with 
the  part  of  the  wave  function  passing  through  the  second  slit.  Only  after 
this  interference  has  taken  place  do  we  take  the  magnitude  squared  of  the 
wave  function  to  obtain  the  probability  distribution,  which  will,  therefore, 
show  the  sorts  of  peaks  and  valleys  we  see  in  Fig.  1.2. 
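The distinction between adding amplitudes and adding probabilities can be made concrete with complex arithmetic. In the sketch below, each slit contributes a unit-strength wave $e^{ikr}$ at the screen, where $r$ is the path length. This simplified model and the numerical parameters are assumptions made for illustration.

```python
import cmath
import math

WAVELENGTH = 1.0           # arbitrary units (assumed)
SLIT_SEPARATION = 5.0      # distance between the two slits (assumed)
SCREEN_DISTANCE = 100.0    # distance from the slits to the screen (assumed)
K = 2 * math.pi / WAVELENGTH


def amplitude(x, slit_y):
    """Complex amplitude at screen position x from a slit at height slit_y."""
    r = math.hypot(SCREEN_DISTANCE, x - slit_y)   # path length from slit to screen
    return cmath.exp(1j * K * r)


def both_slits(x):
    """Add the two amplitudes first, then take the magnitude squared."""
    psi = amplitude(x, SLIT_SEPARATION / 2) + amplitude(x, -SLIT_SEPARATION / 2)
    return abs(psi) ** 2


def no_interference(x):
    """Add the two magnitudes squared, as one would for ordinary probabilities."""
    return (abs(amplitude(x, SLIT_SEPARATION / 2)) ** 2
            + abs(amplitude(x, -SLIT_SEPARATION / 2)) ** 2)


for x in (0.0, 5.0, 10.0, 15.0, 20.0):
    print(f"x = {x:5.1f}  amplitudes added: {both_slits(x):.3f}  "
          f"probabilities added: {no_interference(x):.3f}")
```

Adding the probabilities gives the same value at every point of the screen, whereas adding the amplitudes first gives values that range from nearly zero to twice that value, which is the pattern of peaks and valleys described above.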

Born’s  introduction  of  a  probabilistic  element  into  the  interpretation  of 
quantum mechanics was (and to some extent still is) controversial. Einstein,
for example, is often quoted as saying something along the lines of,
“God  does  not  play  at  dice  with  the  universe.”  Einstein  expressed  the  same 
sentiment  in  various  ways  over  the  years.  His  earliest  known  statement  to 
this  effect  was  in  a  letter  to  Born  in  December  1926,  in  which  he  said, 

Quantum  mechanics  is  certainly  imposing.  But  an  inner  voice 
tells  me  that  it  is  not  yet  the  real  thing.  The  theory  says  a  lot, 
but  does  not  really  bring  us  any  closer  to  the  secret  of  the  “old 
one.”  I,  at  any  rate,  am  convinced  that  He  does  not  throw  dice. 

Many  other  physicists  and  philosophers  have  questioned  the  probabilistic 
interpretation  of  quantum  mechanics,  and  have  sought  alternatives,  such 
as  “hidden  variable”  theories.  Nevertheless,  the  Copenhagen  interpretation 
of quantum mechanics, essentially as proposed by Born in 1926, remains
the standard one. This book resolutely avoids all controversies surrounding
the interpretation of quantum mechanics. Chapter 3, for example,
presents the standard statistical interpretation of the theory without question.
The book may nevertheless be of use to the more philosophically
minded reader, in that one must learn something of quantum mechanics
before delving into the (often highly technical) discussions about its interpretation.


1.5  Exercises 


1. Beginning with the formula for the sum of a geometric series, use
differentiation to obtain the identity

$$\sum_{n=0}^{\infty} n e^{-An} = \frac{e^{-A}}{(1 - e^{-A})^2}.$$

2. In Planck’s model of blackbody radiation, the energy in a given frequency
$\omega$ of electromagnetic radiation is distributed randomly over
all numbers of the form $n\hbar\omega$, where $n = 0, 1, 2, \ldots$. Specifically, the
likelihood of finding energy $n\hbar\omega$ is postulated to be

$$p(E = n\hbar\omega) = \frac{1}{Z} e^{-\beta n\hbar\omega},$$

where $Z$ is a normalization constant, which is chosen so that the sum
over $n$ of the probabilities is 1. Here $\beta = 1/(k_B T)$, where $T$ is the
temperature and $k_B$ is Boltzmann’s constant. The expected value of
the energy, denoted $\langle E \rangle$, is defined to be

$$\langle E \rangle = \frac{1}{Z}\sum_{n=0}^{\infty} (n\hbar\omega)\, e^{-\beta n\hbar\omega}.$$

(a) Using Exercise 1, show that

$$\langle E \rangle = \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}.$$

(b) Show that $\langle E \rangle$ behaves like $1/\beta = k_B T$ for small $\omega$, but that
$\langle E \rangle$ decays exponentially as $\omega$ tends to infinity.

Note: In applying the above calculation to blackbody radiation, one
must also take into account the number of modes having frequency
in a given range, say between $\omega_0$ and $\omega_0 + \varepsilon$. The exact number of
such frequencies depends on the shape of the cavity, but according to
Weyl’s law, this number will be approximately proportional to $\varepsilon\omega_0^2$ for
large values of $\omega_0$. Thus, the amount of energy per unit of frequency is

$$C\,\frac{\hbar\omega^3}{e^{\beta\hbar\omega} - 1}, \tag{1.7}$$

where $C$ is a constant involving the volume of the cavity and the
speed of light. The relation (1.7) is known as Planck’s law.

3. In classical mechanics, the kinetic energy of an electron is $m_e v^2/2$,
where $v$ is the magnitude of the velocity. Meanwhile, the potential
energy associated with the force law (1.3) is $V(r) = -Q^2/r$, since
$dV/dr = F$. Show that if the particle is moving in a circular orbit
with radius $r_n$ given by (1.5), then the total energy (kinetic plus
potential) of the particle is $E_n$, as given in (1.1).
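Readers who wish to sanity-check their answers to Exercises 1 and 2 numerically might use a sketch along the following lines. It compares partial sums with the closed-form expressions stated in the exercises. It is a numerical check only, not a derivation, and the function names and the truncation at 2000 terms are our own choices.

```python
import math


def partial_sum(A, terms=2000):
    """Partial sum of n * exp(-A n) for n = 0, ..., terms - 1."""
    return sum(n * math.exp(-A * n) for n in range(terms))


def closed_form(A):
    """Right-hand side of the identity in Exercise 1."""
    return math.exp(-A) / (1.0 - math.exp(-A)) ** 2


def mean_energy(beta, hbar_omega, terms=2000):
    """Expected energy computed directly from the postulated distribution."""
    weights = [math.exp(-beta * n * hbar_omega) for n in range(terms)]
    Z = sum(weights)                              # normalization constant
    return sum(n * hbar_omega * w for n, w in enumerate(weights)) / Z


for A in (0.1, 1.0, 3.0):
    print(A, partial_sum(A), closed_form(A))

beta = 1.0
for hw in (0.05, 1.0, 10.0):
    # Direct sum versus the formula stated in Exercise 2(a).
    print(hw, mean_energy(beta, hw), hw / math.expm1(beta * hw))
```

With $\beta = 1$, the value for small $\hbar\omega$ is close to $1/\beta$ and the value for large $\hbar\omega$ is very small, in line with part (b) of Exercise 2.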


2  A  First  Approach  to  Classical  Mechanics


2.1  Motion  in  $\mathbb{R}^1$


2.1.1  Newton’s  law 

We begin by considering the motion of a single particle in $\mathbb{R}^1$, which may
be thought of as a particle sliding along a wire, or a particle with motion
that just happens to lie in a line. We let $x(t)$ denote the particle’s position
as a function of time. The particle’s velocity is then

$$v(t) := \dot{x}(t),$$

where we use a dot over a symbol to denote the derivative of that quantity
with respect to the time $t$.

The particle’s acceleration is then

$$a(t) = \dot{v}(t) = \ddot{x}(t),$$

where $\ddot{x}$ denotes the second derivative of $x$ with respect to $t$. We assume
that there is a force acting on the particle and we assume at first that the
force $F$ is a function of the particle’s position only. (Later, we will look at
the case of forces that depend also on velocity.)

Under these assumptions, Newton’s second law ($F = ma$) takes the form

$$F(x(t)) = ma = m\ddot{x}(t), \tag{2.1}$$

where $m$ is the mass of the particle, which is assumed to be positive. We will
henceforth abbreviate Newton’s second law as simply “Newton’s law,” since
we will use the second law much more frequently than the others. Since
(2.1) is of second order, the appropriate initial conditions (needed to get
a unique solution) are the position and velocity at some initial time $t_0$. So
we look for solutions of (2.1) subject to

$$\begin{aligned}
x(t_0) &= x_0, \\
\dot{x}(t_0) &= v_0.
\end{aligned}$$

B.C. Hall, Quantum Theory for Mathematicians, Graduate Texts in Mathematics 267, DOI 10.1007/978-1-4614-7116-5_2, © Springer Science+Business Media New York 2013

Assuming that $F$ is a smooth function, standard results from the elementary
theory of differential equations tell us that there exists a unique
local solution to (2.1) for each pair of initial conditions. (A local solution
is one defined for $t$ in a neighborhood of the initial time $t_0$.) Since (2.1) is
in general a nonlinear equation, one cannot expect that, for a general force
function $F$, the solutions will exist for all $t$. If, for example, $F(x) = x^2$, then
any solution with positive initial position and positive initial velocity will
escape to infinity in finite time. (Apply Exercise 4 with $V(x) = -x^3/3$.)
For a proof of existence and uniqueness, see Example 8.2 and Theorem 8.13
in [28].

Definition  2.1  A  solution  x(t)  to  Newton’s  law  is  called  a  trajectory. 

Example 2.2 (Harmonic Oscillator) If the force is given by Hooke’s
law, $F(x) = -kx$, where $k$ is a positive constant, then Newton’s law can be
written as $m\ddot{x} + kx = 0$. The general solution of this equation is

$$x(t) = a\cos(\omega t) + b\sin(\omega t),$$

where $\omega := \sqrt{k/m}$ is the frequency of oscillation.

The system in Example 2.2 is referred to as a (classical) harmonic oscillator.
This system can describe a mass on a spring, where the force is
proportional to the distance $x$ that the spring is stretched from its equilibrium
position. The minus sign in $-kx$ indicates that the force pulls the
oscillator back toward equilibrium. Here and elsewhere in the book, we
use the “angular” notion of frequency, which is the rate of change of the
argument of a sine or cosine function. If $\omega$ is the angular frequency, then
the “ordinary” frequency (i.e., the number of cycles per unit of time) is
$\omega/2\pi$. Saying that $x$ has (angular) frequency $\omega$ means that $x$ is periodic
with period $2\pi/\omega$.
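The general solution in Example 2.2 can be checked numerically. The sketch below builds the trajectory with prescribed initial position and velocity (taking the initial time to be 0, so that $a = x_0$ and $b = v_0/\omega$), estimates $m\ddot{x} + kx$ with a finite difference, and confirms the period $2\pi/\omega$. The parameter values and function names are illustrative assumptions.

```python
import math


def oscillator(x0, v0, k, m):
    """Return the trajectory x(t) with x(0) = x0 and x'(0) = v0.

    Uses x(t) = a cos(w t) + b sin(w t) with a = x0 and b = v0 / w.
    """
    w = math.sqrt(k / m)
    a, b = x0, v0 / w
    return lambda t: a * math.cos(w * t) + b * math.sin(w * t)


def residual(x, k, m, t, h=1e-4):
    """Approximate m x''(t) + k x(t) using a central second difference."""
    second_derivative = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return m * second_derivative + k * x(t)


k, m = 4.0, 1.0                       # spring constant and mass (assumed values)
x = oscillator(x0=1.0, v0=0.5, k=k, m=m)
period = 2 * math.pi / math.sqrt(k / m)

print(residual(x, k, m, t=0.3))       # close to zero
print(x(0.3), x(0.3 + period))        # equal up to rounding
```

The residual is not exactly zero because the second derivative is only approximated; it shrinks as the step `h` is refined, until rounding error takes over.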

2.1.2  Conservation  of  Energy 

We return now to the case of a general force function $F(x)$. We define
the kinetic energy of the system to be $\frac{1}{2}mv^2$. We also define the potential
energy of the system as the function

$$V(x) = -\int F(x)\,dx, \tag{2.2}$$

so that $F(x) = -dV/dx$. (The potential energy is defined only up to adding
a constant.) The total energy $E$ of the system is then

$$E(x, v) = \frac{1}{2}mv^2 + V(x). \tag{2.3}$$

The chief significance of the energy function is that it is conserved, meaning
that its value along any trajectory is constant.

Theorem 2.3 Suppose a particle satisfies Newton’s law in the form $m\ddot{x} = F(x)$.
Let $V$ and $E$ be as in (2.2) and (2.3). Then the energy $E$ is conserved,
meaning that for each solution $x(t)$ of Newton’s law, $E(x(t), \dot{x}(t))$ is independent
of $t$.

Proof. We verify this by differentiation, using the chain rule:

$$\begin{aligned}
\frac{d}{dt}E(x(t), \dot{x}(t)) &= \frac{d}{dt}\left[\frac{1}{2}m(\dot{x}(t))^2 + V(x(t))\right] \\
&= m\dot{x}(t)\ddot{x}(t) + \frac{dV}{dx}\dot{x}(t) \\
&= \dot{x}(t)\left[m\ddot{x}(t) - F(x(t))\right].
\end{aligned}$$

This last expression is zero by Newton’s law. Thus, the time-derivative of
the energy along any trajectory is zero, so $E(x(t), \dot{x}(t))$ is independent of
$t$, as claimed. ■

We may call the energy a conserved quantity (or constant of motion),
since  the  particle  neither  gains  nor  loses  energy  as  the  particle  moves 
according  to  Newton’s  law. 

Let  us  see  how  conservation  of  energy  helps  us  understand  the  solution 
to Newton’s law. We may reduce the second-order equation $m\ddot{x} = F(x)$ to
a pair of first-order equations, simply by introducing the velocity $v$ as a
new variable. That is, we look for pairs of functions $(x(t), v(t))$ that satisfy
the following system of equations:

$$\begin{aligned}
\frac{dx}{dt} &= v(t), \\
\frac{dv}{dt} &= \frac{1}{m}F(x(t)).
\end{aligned} \tag{2.4}$$

If $(x(t), v(t))$ is a solution to this system, then we can immediately see that
$x(t)$ satisfies Newton’s law, just by substituting $dx/dt$ for $v$ in the second
equation. We refer to the set of possible pairs of the form $(x, v)$ (i.e., $\mathbb{R}^2$)
as the phase space of the particle in $\mathbb{R}^1$. The appropriate initial conditions
for this first-order system are $x(0) = x_0$ and $v(0) = v_0$.

Once we are working in phase space, we can use the conservation of
energy to help us. Conservation of energy means that each solution to
the system (2.4) must lie entirely on a single “level curve” of the energy
function, that is, the set

$$\{(x, v) \in \mathbb{R}^2 \mid E(x, v) = E(x_0, v_0)\}. \tag{2.5}$$

If $F$ (and therefore also $V$) is smooth, then $E$ is a smooth function of $x$
and $v$. Then as long as (2.5) contains no critical points of $E$, this set will
be a smooth curve in $\mathbb{R}^2$, by the implicit function theorem. If the level set
(2.5) is also a simple closed curve, then the solutions of (2.4) will simply
wind around and around this curve. Thus, the set that the solutions to (2.4)
trace out in phase space can be determined simply from the conservation
of energy. The only thing not apparent at the moment is how this curve is
parameterized as a function of time.

In mechanics, a conserved quantity (such as the energy in the
one-dimensional version of Newton’s law) is often referred to as an “integral
of motion.” The reason for this is that although Newton’s second law is a
second-order equation in $x$, the energy depends only on $x$ and $\dot{x}$ and not
on $\ddot{x}$. Thus, the equation

$$\frac{m}{2}(\dot{x}(t))^2 + V(x(t)) = E_0,$$

where $E_0$ is the value of the energy at time $t_0$, is actually a first-order
differential equation. We can solve for $\dot{x}$ to put this equation into a more
standard form:

$$\dot{x}(t) = \pm\sqrt{\frac{2(E_0 - V(x(t)))}{m}}. \tag{2.6}$$

What this means is that by using conservation of energy we have turned the
original second-order equation into a first-order equation. We have therefore
“integrated” the original equation once, that is, changed an equation of
the form $\ddot{x}(t) = \cdots$ into an equation of the form $\dot{x}(t) = \cdots$. The
first-order equation (2.6) is separable and can be solved more-or-less explicitly
(Exercise 1).
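As a numerical illustration of Theorem 2.3 (not part of the original argument), the following Python sketch integrates Newton's law for the harmonic oscillator $F(x) = -kx$ and monitors the energy (2.3) along the computed trajectory. The velocity Verlet integrator, the parameter values, and all function names are assumptions chosen for illustration; the discrete energy is only approximately constant, with an error that shrinks as the time step decreases.

```python
import math

def simulate(force, m, x0, v0, dt, steps):
    """Integrate m * x'' = force(x) with the velocity Verlet scheme."""
    x, v = x0, v0
    a = force(x) / m
    path = [(x, v)]
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x) / m
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        path.append((x, v))
    return path

m, k = 2.0, 8.0                      # assumed mass and spring constant

def force(x):
    return -k * x                    # F(x) = -dV/dx

def potential(x):
    return 0.5 * k * x * x           # V(x) = -integral of F, constant chosen as 0

def energy(x, v):
    return 0.5 * m * v * v + potential(x)   # equation (2.3)

path = simulate(force, m, x0=1.0, v0=0.0, dt=1e-3, steps=10_000)
energies = [energy(x, v) for x, v in path]
drift = max(abs(e - energies[0]) for e in energies)
print("largest deviation of E from its initial value:", drift)
```

With these values the angular frequency is $\sqrt{k/m} = 2$, so the exact solution is $x(t) = \cos 2t$, which gives an independent check on the computed trajectory.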

2.1.3  Systems  with  Damping 

Up to now, we have considered forces that depend only on position. It is
common, however, to consider forces that depend on the velocity as well
as the position. In the case of a damped harmonic oscillator, for example,
one typically assumes that there is, in addition to the force of the spring,
a damping force (friction, say) that is proportional to the velocity. Thus,
$F = -kx - \gamma\dot{x}$, where $k$ is, as before, the spring constant and where $\gamma > 0$
is the damping constant. The minus sign in front of $\gamma\dot{x}$ reflects that the
damping force operates in the opposite direction to the velocity, causing
the particle to slow down. The equation of motion for such a system is then

$$m\ddot{x} + \gamma\dot{x} + kx = 0.$$

If $\gamma$ is small, the solutions to this equation display decaying oscillation,
meaning sines and cosines multiplied by a decaying exponential; if $\gamma$ is
large, the solutions are pure decaying exponentials (Exercise 5).

In the case of the damped harmonic oscillator, there is no longer a
conserved energy. Specifically, there is no nonconstant continuous function
$E$ on $\mathbb{R}^2$ such that $E(x(t), \dot{x}(t))$ is independent of $t$ for all solutions of
Newton’s law. To see this, we simply observe that for $\gamma > 0$, all solutions
$x(t)$ have the property that $(x(t), \dot{x}(t))$ tends to the origin in the plane as $t$
tends to infinity. Thus, if $E$ is continuous and constant along each trajectory,
the value of $E$ at the starting point has to be the same as the value
at the origin. Since the starting point is arbitrary, $E$ must be constant on all
of $\mathbb{R}^2$.

We  now  consider  a  general  system  with  damping. 


Proposition 2.4 Suppose a particle moves in the presence of a force law
given by $F(x, \dot{x}) = F_1(x) - \gamma\dot{x}$, with $\gamma > 0$. Define the energy $E$ of the
system by

$$E(x, \dot{x}) = \frac{1}{2}m\dot{x}^2 + V(x),$$

where $dV/dx = -F_1(x)$. Then along any trajectory $x(t)$, we have

$$\frac{d}{dt}E(x(t), \dot{x}(t)) = -\gamma\dot{x}(t)^2 \le 0.$$

Thus, although the energy is not conserved, it is decreasing with time,
which gives us some information about the behavior of the system.

Proof. We differentiate as in the proof of Theorem 2.3, except that now
$dV/dx = -F_1(x)$:

$$\frac{d}{dt}E(x(t), \dot{x}(t)) = \dot{x}(t)\left[m\ddot{x}(t) - F_1(x(t))\right].$$

Since $F_1$ is not the full force function, the quantity in square brackets equals
not zero but $-\gamma\dot{x}$. Thus, $dE/dt = -\gamma\dot{x}^2$. ■

We  can  interpret  Proposition  2.4  as  saying  that  in  the  presence  of  friction, 
the  system  we  are  studying  gives  up  some  of  its  energy  to  heat  energy  in 
the  environment,  so  that  the  energy  of  our  system  decreases  with  time. 
We  will  see  that  in  higher  dimensions,  it  is  possible  to  have  conservation 
of  energy  in  the  presence  of  velocity-dependent  forces,  provided  that  these 
forces  act  perpendicularly  to  the  velocity. 
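The energy balance in Proposition 2.4 can be checked numerically. The sketch below, which assumes a damped harmonic oscillator with illustrative values of $m$, $k$, and $\gamma$ and uses a classical fourth-order Runge-Kutta step (an integrator chosen here, not one used in the text), compares the energy lost over a time interval with a trapezoid estimate of $\int \gamma\dot{x}^2\,dt$.

```python
def rk4_step(deriv, state, dt):
    """One classical Runge-Kutta step for state' = deriv(state)."""
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = deriv([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

m, k, gamma = 1.0, 4.0, 0.3          # assumed parameters

def deriv(state):
    x, v = state
    return [v, (-k * x - gamma * v) / m]     # m x'' = -k x - gamma x'

def energy(state):
    x, v = state
    return 0.5 * m * v * v + 0.5 * k * x * x

dt = 1e-3
state = [1.0, 0.0]
energies = [energy(state)]
dissipated = 0.0                      # trapezoid estimate of the integral of gamma * v^2
for _ in range(5000):
    new_state = rk4_step(deriv, state, dt)
    dissipated += 0.5 * dt * gamma * (state[1] ** 2 + new_state[1] ** 2)
    state = new_state
    energies.append(energy(state))

print("energy lost:        ", energies[0] - energies[-1])
print("integral of g*v^2:  ", dissipated)
```

The two printed numbers should agree up to discretization error, and the list `energies` should be nonincreasing, in line with $dE/dt = -\gamma\dot{x}^2 \le 0$.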


2.2 Motion in $\mathbb{R}^n$

We now consider a particle moving in $\mathbb{R}^n$. The position $\mathbf{x} = (x_1, \ldots, x_n)$
of a particle is now a vector in $\mathbb{R}^n$, as is the velocity $\mathbf{v}$ and acceleration $\mathbf{a}$.
We let

$$\dot{\mathbf{x}} = (\dot{x}_1, \ldots, \dot{x}_n)$$

denote the derivative of $\mathbf{x}$ with respect to $t$ and we let $\ddot{\mathbf{x}}$ denote the second
derivative of $\mathbf{x}$ with respect to $t$. Newton’s law now takes the form

$$m\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t)), \tag{2.7}$$

where $\mathbf{F} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is some force law, which in general may depend
on both the position and velocity of the particle.

We  begin  by  considering  forces  that  are  independent  of  velocity,  and  we 
look  for  a  conserved  energy  function  in  this  setting. 


Proposition 2.5 Consider Newton’s law (2.7) in the case of a
velocity-independent force: $m\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t))$. Then an energy function of the form

$$E(\mathbf{x}, \dot{\mathbf{x}}) = \frac{1}{2}m|\dot{\mathbf{x}}|^2 + V(\mathbf{x})$$

is conserved if and only if $V$ satisfies

$$-\nabla V = \mathbf{F},$$

where $\nabla V$ is the gradient of $V$.

Saying that $E$ is “conserved” means that $E(\mathbf{x}(t), \dot{\mathbf{x}}(t))$ is independent of
$t$ for each solution $\mathbf{x}(t)$ of Newton’s law. The function $V$ is the potential
energy of the system.

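Proof. Differentiating gives

$$\begin{aligned}
\frac{d}{dt}\left[\frac{1}{2}m|\dot{\mathbf{x}}(t)|^2 + V(\mathbf{x}(t))\right]
&= \sum_{j=1}^n m\dot{x}_j(t)\ddot{x}_j(t) + \sum_{j=1}^n \frac{\partial V}{\partial x_j}\dot{x}_j(t) \\
&= \dot{\mathbf{x}}(t) \cdot \left[m\ddot{\mathbf{x}}(t) + \nabla V(\mathbf{x}(t))\right] \\
&= \dot{\mathbf{x}}(t) \cdot \left[\mathbf{F}(\mathbf{x}(t)) + \nabla V(\mathbf{x}(t))\right].
\end{aligned}$$

Thus, $dE/dt$ will always be equal to zero if and only if we have

$$-\nabla V(\mathbf{x}) = \mathbf{F}(\mathbf{x})$$

for all $\mathbf{x}$. ■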

We now encounter something that did not occur in the one-dimensional
case. In $\mathbb{R}^1$, any smooth function can be expressed as the derivative of some
other function. In $\mathbb{R}^n$, however, not every vector-valued function $\mathbf{F}(\mathbf{x})$ can
be expressed as the (negative of) the gradient of some scalar-valued function
$V$.

Definition 2.6 Suppose $\mathbf{F}$ is a smooth, $\mathbb{R}^n$-valued function on a domain
$U \subset \mathbb{R}^n$. Then $\mathbf{F}$ is called conservative if there exists a smooth, real-valued
function $V$ on $U$ such that $\mathbf{F} = -\nabla V$.

If the domain $U$ is simply connected, then there is a simple local condition
that characterizes conservative functions.

Proposition 2.7 Suppose $U$ is a simply connected domain in $\mathbb{R}^n$ and $\mathbf{F}$
is a smooth, $\mathbb{R}^n$-valued function on $U$. Then $\mathbf{F}$ is conservative if and only
if $\mathbf{F}$ satisfies

$$\frac{\partial F_j}{\partial x_k} = \frac{\partial F_k}{\partial x_j} \tag{2.8}$$

for all $j$ and $k$ at each point in $U$.

When $n = 3$, it is easy to check that the condition (2.8) is equivalent
to the curl $\nabla \times \mathbf{F}$ of $\mathbf{F}$ being zero on $U$. The hypothesis that $U$ be simply
connected cannot be omitted; see Exercise 7.
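Condition (2.8) says that the Jacobian matrix of $\mathbf{F}$ is symmetric, which is easy to test numerically at a sample point. The following sketch approximates the Jacobian by central differences and reports the largest asymmetry; the two example force fields, the step size, and the function names are assumptions made for illustration only.

```python
def jacobian(F, x, h=1e-5):
    """Central-difference approximation of J[j][k] = dF_j / dx_k at the point x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        Fp, Fm = F(xp), F(xm)
        for j in range(n):
            J[j][k] = (Fp[j] - Fm[j]) / (2 * h)
    return J

def max_asymmetry(F, x):
    """Largest value of |dF_j/dx_k - dF_k/dx_j|; zero when (2.8) holds at x."""
    J = jacobian(F, x)
    n = len(x)
    return max(abs(J[j][k] - J[k][j]) for j in range(n) for k in range(n))

def conservative(x):
    # F = -grad V for the assumed potential V = x1^2 * x2 + x3^2 / 2
    return [-2 * x[0] * x[1], -x[0] ** 2, -x[2]]

def rotational(x):
    # F = (-x2, x1, 0), whose curl is (0, 0, 2), so it is not conservative
    return [-x[1], x[0], 0.0]

point = [0.7, -1.2, 0.4]
print(max_asymmetry(conservative, point))   # close to 0
print(max_asymmetry(rotational, point))     # close to 2
```

A check at finitely many points can only refute (2.8), not prove it; it is meant as a quick sanity test rather than a substitute for the proposition.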

Proof. If $\mathbf{F}$ is conservative, then

$$\frac{\partial F_j}{\partial x_k} = -\frac{\partial^2 V}{\partial x_k \partial x_j} = -\frac{\partial^2 V}{\partial x_j \partial x_k} = \frac{\partial F_k}{\partial x_j}$$

at every point in $U$. In the other direction, if $\mathbf{F}$ satisfies (2.8), $V$ can be
obtained by integrating $\mathbf{F}$ along paths and using the Stokes theorem to
establish independence of choice of path. See, for example, Theorem 4.3 on
p. 549 of [44] for a proof in the $n = 3$ case. The proof in higher dimensions
is the same, provided one knows the general version of the Stokes theorem. ■


We may also consider velocity-dependent forces. If, for example,
$\mathbf{F}(\mathbf{x}, \mathbf{v}) = -\gamma\mathbf{v} + \mathbf{F}_1(\mathbf{x})$, where $\gamma$ is a positive constant, then we will again have
energy that is decreasing with time. There is another new phenomenon,
however, in dimension greater than 1, namely the possibility of having a
conserved energy even when the force depends on velocity.

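Proposition 2.8 Suppose a particle in $\mathbb{R}^n$ moves in the presence of a force
$\mathbf{F}$ of the form

$$\mathbf{F}(\mathbf{x}, \mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x}, \mathbf{v}),$$

where $V$ is a smooth function and where $\mathbf{F}_2$ satisfies

$$\mathbf{v} \cdot \mathbf{F}_2(\mathbf{x}, \mathbf{v}) = 0 \tag{2.9}$$

for all $\mathbf{x}$ and $\mathbf{v}$ in $\mathbb{R}^n$. Then the energy function
$E(\mathbf{x}, \mathbf{v}) = \frac{1}{2}m|\mathbf{v}|^2 + V(\mathbf{x})$ is constant along each trajectory.

If, for example, $\mathbf{F}_2$ is the force exerted on a charged particle in $\mathbb{R}^3$ by a
magnetic field $\mathbf{B}(\mathbf{x})$, then

$$\mathbf{F}_2(\mathbf{x}, \mathbf{v}) = q\mathbf{v} \times \mathbf{B}(\mathbf{x}),$$

where $q$ is the charge of the particle. This force clearly satisfies (2.9), since
$\mathbf{v} \times \mathbf{B}(\mathbf{x})$ is perpendicular to $\mathbf{v}$.

Proof. See Exercise 8. ■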
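To see Proposition 2.8 at work, the sketch below follows a charged particle in $\mathbb{R}^3$ subject to a spring force $-k\mathbf{x}$ and a magnetic force $q\mathbf{v} \times \mathbf{B}$, and checks that $E = \frac{1}{2}m|\mathbf{v}|^2 + \frac{1}{2}k|\mathbf{x}|^2$ stays constant to within integration error. The uniform field, the quadratic potential, the Runge-Kutta integrator, and all numerical values are assumptions made for this illustration.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

m, q, k = 1.0, 1.5, 2.0
B = [0.0, 0.0, 1.0]                  # uniform field along the third axis (assumed)

def deriv(state):
    """state = (x1, x2, x3, v1, v2, v3); returns its time derivative."""
    x, v = state[:3], state[3:]
    magnetic = [q * c for c in cross(v, B)]          # F_2 = q v x B
    accel = [(-k * xi + fi) / m for xi, fi in zip(x, magnetic)]
    return v + accel                                  # list concatenation

def rk4_step(state, dt):
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * dt * d for s, d in zip(state, k1)])
    k3 = deriv([s + 0.5 * dt * d for s, d in zip(state, k2)])
    k4 = deriv([s + dt * d for s, d in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def energy(state):
    x, v = state[:3], state[3:]
    return 0.5 * m * sum(c * c for c in v) + 0.5 * k * sum(c * c for c in x)

state = [1.0, 0.0, 0.5, 0.0, 1.0, 0.2]
E_start = energy(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
E_end = energy(state)
print("change in energy:", E_end - E_start)
```

The magnetic force bends the trajectory but does no work, which is why it drops out of the energy balance even though it depends on the velocity.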
2.3 Systems of Particles


If we have a system of $N$ particles, each moving in $\mathbb{R}^n$, then we denote the
position of the $j$th particle by

$$\mathbf{x}^j = (x^j_1, \ldots, x^j_n).$$

Thus, in the expression $x^j_k$, the superscript $j$ indicates the $j$th particle, while
the subscript $k$ indicates the $k$th component. Newton’s law then takes the
form

$$m_j\ddot{\mathbf{x}}^j = \mathbf{F}^j(\mathbf{x}^1, \ldots, \mathbf{x}^N, \dot{\mathbf{x}}^1, \ldots, \dot{\mathbf{x}}^N), \quad j = 1, 2, \ldots, N,$$

where $m_j$ is the mass of the $j$th particle. Here, $\mathbf{F}^j$ is the force on the $j$th
particle, which in general will depend on the position and velocity not only
of that particle, but also of the other particles.


2.3.1 Conservation of Energy


In a system of particles, we cannot expect that the energy of each individual
particle will be conserved, because as the particles interact, they can
exchange energy. Rather, we should expect that, under suitable assumptions
on the forces $\mathbf{F}^j$, we can define a conserved energy function for the
whole system (the total energy of the system).

Let us consider forces depending only on the position of the particles,
and let us assume that the energy function will be of the form

$$E(\mathbf{x}^1, \ldots, \mathbf{x}^N, \dot{\mathbf{x}}^1, \ldots, \dot{\mathbf{x}}^N) = \sum_{j=1}^N \frac{1}{2}m_j|\dot{\mathbf{x}}^j|^2 + V(\mathbf{x}^1, \ldots, \mathbf{x}^N). \tag{2.10}$$

We will now try to see what form for $V$ (if any) will allow $E$ to be constant
along each trajectory.


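Proposition 2.9 An energy function of the form (2.10) is constant along
each trajectory if

$$\nabla^j V = -\mathbf{F}^j \tag{2.11}$$

for each $j$, where $\nabla^j$ is the gradient with respect to the variable $\mathbf{x}^j$.


Proof. We compute that

$$\begin{aligned}
\frac{dE}{dt} &= \sum_{j=1}^N \left[m_j\dot{\mathbf{x}}^j \cdot \ddot{\mathbf{x}}^j + \nabla^j V \cdot \dot{\mathbf{x}}^j\right] \\
&= \sum_{j=1}^N \dot{\mathbf{x}}^j \cdot \left[m_j\ddot{\mathbf{x}}^j + \nabla^j V\right] \\
&= \sum_{j=1}^N \dot{\mathbf{x}}^j \cdot \left[\mathbf{F}^j + \nabla^j V\right].
\end{aligned}$$

If $\nabla^j V = -\mathbf{F}^j$, then $E$ will be conserved. ■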

As  in  the  one-particle  case,  there  is  a  simple  condition  for  the  existence 
of  a  potential  function  V  satisfying  (2.11). 

Proposition 2.10 Suppose a force function $\mathbf{F} = (\mathbf{F}^1, \ldots, \mathbf{F}^N)$ is defined
on a simply connected domain $U$ in $\mathbb{R}^{nN}$. Then there exists a smooth
function $V$ on $U$ satisfying

$$\nabla^j V = -\mathbf{F}^j$$

for all $j$ if and only if we have

$$\frac{\partial F^j_k}{\partial x^l_m} = \frac{\partial F^l_m}{\partial x^j_k} \tag{2.12}$$

for all $j$, $k$, $l$, and $m$.

Proof. Apply Proposition 2.7 with $n$ replaced by $nN$ and with $j$ and $k$
replaced by the pairs $(j, k)$ and $(l, m)$. ■

2.3.2  Conservation  of  Momentum 

We  now  introduce  the  notion  of  the  momentum  of  a  particle. 

Definition 2.11 In an $N$-particle system, the momentum of the $j$th
particle, denoted $\mathbf{p}^j$, is the product of the mass and the velocity of that
particle:

$$\mathbf{p}^j = m_j\dot{\mathbf{x}}^j.$$

The total momentum of the system, denoted $\mathbf{p}$, is defined as

$$\mathbf{p} = \sum_{j=1}^N \mathbf{p}^j.$$

Observe that

$$\frac{d\mathbf{p}^j}{dt} = m_j\ddot{\mathbf{x}}^j = \mathbf{F}^j.$$

Thus, Newton’s law may be reformulated as saying, “The force is the rate
of change of the momentum.” This is how Newton originally formulated
his second law.

Newton’s third law says, “For every action, there is an equal and opposite
reaction.” This law will apply if all forces are of the “two-particle” variety
and satisfy a natural symmetry property. Having two-particle forces means
that the force $\mathbf{F}^j$ on the $j$th particle is a sum of terms $\mathbf{F}^{j,k}$, $j \ne k$, where
$\mathbf{F}^{j,k}$ depends only on $\mathbf{x}^j$ and $\mathbf{x}^k$. The relevant symmetry property is that
$\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^k, \mathbf{x}^j)$; that is, the force exerted by the $j$th particle
on the $k$th particle is the negative (i.e., “equal and opposite”) of the force
exerted by the $k$th particle on the $j$th particle. If the forces are assumed
also to be conservative, then the potential energy of the system will be of
the form

$$V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \sum_{j<k} V^{j,k}(\mathbf{x}^j - \mathbf{x}^k). \tag{2.13}$$

One important consequence of Newton’s third law is conservation of the
total momentum of the system.

Proposition 2.12 Suppose that for each $j$, the force on the $j$th particle is
of the form

$$\mathbf{F}^j(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \sum_{k \ne j} \mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k),$$

for certain functions $\mathbf{F}^{j,k}$. Suppose also that we have the “equal and
opposite” condition

$$\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^k, \mathbf{x}^j).$$

Then the total momentum of the system is conserved.

Note that since the rate of change of $\mathbf{p}^j$ is $\mathbf{F}^j$, the force on the $j$th
particle, the momentum of each individual particle is not constant in time,
except in the trivial case of a noninteracting system (one in which all forces
are zero).

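Proof. Differentiating gives

$$\frac{d\mathbf{p}}{dt} = \sum_{j=1}^N \frac{d\mathbf{p}^j}{dt} = \sum_{j=1}^N \mathbf{F}^j = \sum_{j=1}^N \sum_{k \ne j} \mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k).$$

By the equal and opposite condition, $\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k)$ cancels with $\mathbf{F}^{k,j}(\mathbf{x}^k, \mathbf{x}^j)$,
so $d\mathbf{p}/dt = 0$. ■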
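The cancellation in this proof can be observed in a small simulation. The sketch below takes three particles in $\mathbb{R}^2$ joined pairwise by linear springs, so that $\mathbf{F}^{j,k}(\mathbf{x}^j, \mathbf{x}^k) = -(\mathbf{x}^j - \mathbf{x}^k)$ satisfies the equal and opposite condition, and compares the total momentum before and after a short run. The spring law, masses, initial data, and the velocity Verlet scheme are all assumptions made for illustration.

```python
masses = [1.0, 2.0, 3.0]

def pair_force(xj, xk):
    """F^{j,k}(x^j, x^k): a unit-stiffness spring pulling particle j toward particle k."""
    return [-(a - b) for a, b in zip(xj, xk)]

def forces(positions):
    """F^j = sum over k != j of F^{j,k}(x^j, x^k), for every particle j."""
    result = []
    for j, xj in enumerate(positions):
        total = [0.0, 0.0]
        for k, xk in enumerate(positions):
            if k != j:
                total = [t + f for t, f in zip(total, pair_force(xj, xk))]
        result.append(total)
    return result

def total_momentum(velocities):
    return [sum(m * v[i] for m, v in zip(masses, velocities)) for i in range(2)]

positions = [[0.0, 0.0], [1.0, 0.5], [-0.5, 2.0]]
velocities = [[0.3, 0.0], [0.0, -0.2], [0.1, 0.1]]
p_start = total_momentum(velocities)
p1_start = [masses[0] * c for c in velocities[0]]

dt = 1e-3
f = forces(positions)
for _ in range(500):
    # velocity Verlet update applied to every particle
    positions = [[x + v * dt + 0.5 * (fc / m) * dt * dt
                  for x, v, fc in zip(xs, vs, fs)]
                 for xs, vs, fs, m in zip(positions, velocities, f, masses)]
    f_new = forces(positions)
    velocities = [[v + 0.5 * (a + b) / m * dt for v, a, b in zip(vs, fs, gs)]
                  for vs, fs, gs, m in zip(velocities, f, f_new, masses)]
    f = f_new

p_end = total_momentum(velocities)
p1_end = [masses[0] * c for c in velocities[0]]
print("total momentum:", p_start, "->", p_end)
print("particle 1 momentum:", p1_start, "->", p1_end)
```

The total momentum is unchanged up to rounding, while the momentum of the first particle alone changes noticeably, which matches the remark following Proposition 2.12.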

Let us consider, now, a more general situation in which we have conservative
forces, but not necessarily of the “two-particle” form. It is still
possible to have conservation of momentum, as the following result shows.


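Proposition 2.13 If a multiparticle system has a force law coming from
a potential $V$, then the total momentum of the system is conserved if and
only if

$$V(\mathbf{x}^1 + \mathbf{a}, \mathbf{x}^2 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.14}$$

for all $\mathbf{a} \in \mathbb{R}^n$.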


Proof. Apply (2.14) with $\mathbf{a} = t\mathbf{e}_k$, where $\mathbf{e}_k$ is the vector with a 1 in the
$k$th spot and zeros elsewhere. Differentiating with respect to $t$ at $t = 0$
gives

$$0 = \sum_{j=1}^N \frac{\partial V}{\partial x^j_k} = -\sum_{j=1}^N F^j_k = -\frac{dp_k}{dt},$$

where $p_k$ is the $k$th component of the total momentum $\mathbf{p}$. Thus, if (2.14)
holds, $\mathbf{p}$ is constant in time.

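Conversely, if the momentum is conserved, then the sum of the forces is
zero at every point, and so

$$\begin{aligned}
\frac{d}{dt}V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})
&= \sum_{j=1}^N \nabla^j V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a}) \cdot \mathbf{a} \\
&= -\left(\sum_{j=1}^N \mathbf{F}^j(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})\right) \cdot \mathbf{a} \\
&= 0
\end{aligned}$$

for all $t$. Thus, the value of the quantity being differentiated is the same at
$t = 0$ as at $t = 1$, which establishes (2.14). ■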

The moral of the story is that conservation of momentum is a consequence
of translation-invariance of the system, where “translation invariance”
means invariance under simultaneous translations of every particle by the
same amount. (See Exercise 11 for a more general version of this result.)
If the potential is of the “two-particle” form (2.13), then it is evident that
the condition (2.14) is satisfied.


2.3.3 Center of Mass

We  now  consider  an  important  application  of  momentum  conservation. 

Definition 2.14 For a system of $N$ particles moving in $\mathbb{R}^n$, the center
of mass of the system at a fixed time is the vector $\mathbf{c} \in \mathbb{R}^n$ given by

$$\mathbf{c} = \sum_{j=1}^N \frac{m_j}{M}\mathbf{x}^j,$$

where $M = \sum_{j=1}^N m_j$ is the total mass of the system.


The center of mass is a weighted average of the positions of the various
particles. Differentiating $\mathbf{c}(t)$ with respect to $t$ gives

$$\frac{d\mathbf{c}}{dt} = \frac{1}{M}\sum_{j=1}^N m_j\dot{\mathbf{x}}^j = \frac{\mathbf{p}}{M}, \tag{2.15}$$

where $\mathbf{p}$ is the total momentum.

Proposition 2.15 Suppose the total momentum $\mathbf{p}$ of a system is conserved.
Then the center of mass moves in a straight line at constant speed.
Specifically,

$$\mathbf{c}(t) = \mathbf{c}(t_0) + (t - t_0)\frac{\mathbf{p}}{M},$$

where $\mathbf{c}(t_0)$ is the center of mass at some initial time $t_0$.

Proof. The result follows easily from (2.15). ■

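The notion of center of mass is particularly useful in a system of two
particles in which momentum is conserved. For a system of two particles, if
the potential energy $V(\mathbf{x}^1, \mathbf{x}^2)$ is invariant under simultaneous translations
of $\mathbf{x}^1$ and $\mathbf{x}^2$, then it is of the form

$$V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2),$$

where $\tilde{V}(\mathbf{a}) = V(\mathbf{a}, 0)$.

Now, the positions $\mathbf{x}^1$, $\mathbf{x}^2$ of the particles can be recovered from knowledge
of the center of mass $\mathbf{c}$ and the relative position

$$\mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2$$

as follows:

$$\mathbf{x}^1 = \mathbf{c} + \frac{m_2}{m_1 + m_2}\mathbf{y}, \qquad \mathbf{x}^2 = \mathbf{c} - \frac{m_1}{m_1 + m_2}\mathbf{y}.$$

Meanwhile, we may compute that

$$\ddot{\mathbf{y}}(t) = \ddot{\mathbf{x}}^1 - \ddot{\mathbf{x}}^2 = -\frac{1}{m_1}\nabla\tilde{V}(\mathbf{x}^1 - \mathbf{x}^2) - \frac{1}{m_2}\nabla\tilde{V}(\mathbf{x}^1 - \mathbf{x}^2).$$

This calculation gives the following result.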


Proposition 2.16 For a two-particle system with potential energy of the
form $V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2)$, the relative position $\mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2$ satisfies
the differential equation

$$\mu\ddot{\mathbf{y}} = -\nabla\tilde{V}(\mathbf{y}),$$

where $\mu$ is the reduced mass given by

$$\mu = \frac{1}{\frac{1}{m_1} + \frac{1}{m_2}} = \frac{m_1 m_2}{m_1 + m_2}.$$

Thus, when the total momentum of a two-particle system is conserved,
the relative position evolves as a one-particle system with “effective” mass $\mu$,
while the center of mass moves “trivially,” as described in Proposition 2.15.
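Both halves of this reduction can be tested on a simple example. The sketch below simulates two particles on a line coupled by a spring, so that $\tilde{V}(y) = \frac{1}{2}\kappa y^2$, and compares the computed center of mass with Proposition 2.15 and the computed relative position with the solution of $\mu\ddot{y} = -\kappa y$. The one-dimensional setting, the spring potential, the numerical values, and the velocity Verlet scheme are assumptions made for illustration.

```python
import math

m1, m2, kappa = 1.0, 3.0, 3.0        # assumed masses and spring constant
M = m1 + m2
mu = m1 * m2 / M                     # reduced mass

def grad_V_tilde(y):
    return kappa * y                 # derivative of V~(y) = kappa * y^2 / 2

x1, x2 = 1.0, -0.5
v1, v2 = 0.2, 0.6
c0 = (m1 * x1 + m2 * x2) / M         # initial center of mass
p = m1 * v1 + m2 * v2                # total momentum
y0, ydot0 = x1 - x2, v1 - v2

dt, steps = 1e-3, 3000
a1 = -grad_V_tilde(x1 - x2) / m1
a2 = grad_V_tilde(x1 - x2) / m2
for _ in range(steps):
    x1 += v1 * dt + 0.5 * a1 * dt * dt
    x2 += v2 * dt + 0.5 * a2 * dt * dt
    b1 = -grad_V_tilde(x1 - x2) / m1
    b2 = grad_V_tilde(x1 - x2) / m2
    v1 += 0.5 * (a1 + b1) * dt
    v2 += 0.5 * (a2 + b2) * dt
    a1, a2 = b1, b2

t = steps * dt
c_numeric = (m1 * x1 + m2 * x2) / M
c_predicted = c0 + t * p / M                         # Proposition 2.15
omega = math.sqrt(kappa / mu)
y_numeric = x1 - x2
y_predicted = y0 * math.cos(omega * t) + (ydot0 / omega) * math.sin(omega * t)
print(c_numeric, c_predicted)
print(y_numeric, y_predicted)
```

Note that the oscillation frequency of the relative coordinate is $\sqrt{\kappa/\mu}$, not $\sqrt{\kappa/m_1}$ or $\sqrt{\kappa/m_2}$; this is the sense in which $\mu$ acts as an effective mass.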
FIGURE 2.1. $A(t)$ is the area of the shaded region.


2.4  Angular  Momentum 

We start by considering angular momentum in the simplest nontrivial case,
motion in $\mathbb{R}^2$.


Definition 2.17 Consider a particle moving in $\mathbb{R}^2$, having position $\mathbf{x}$,
velocity $\mathbf{v}$, and momentum $\mathbf{p} = m\mathbf{v}$. Then the angular momentum of
the particle, denoted $J$, is given by

$$J = x_1 p_2 - x_2 p_1. \tag{2.16}$$

In more geometric terms, $J = |\mathbf{x}||\mathbf{p}|\sin\phi$, where $\phi$ is the angle (measured
counterclockwise) between $\mathbf{x}$ and $\mathbf{p}$. We can look at $J$ in yet another way
as follows. If $\theta$ is the usual angle in polar coordinates on $\mathbb{R}^2$, then an
elementary calculation (Exercise 9) shows that

$$J = mr^2\frac{d\theta}{dt}. \tag{2.17}$$

It then follows that

$$J = 2m\frac{dA}{dt}, \tag{2.18}$$

where $A = (1/2)\int r^2\,d\theta$ is the area being swept out by the curve $\mathbf{x}(t)$.
See Fig. 2.1.

One  significant  property  of  the  angular  momentum  is  that  it  (like  the 
energy)  is  conserved  in  certain  situations. 


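Proposition 2.18 Suppose a particle of mass $m$ is moving in $\mathbb{R}^2$ under
the influence of a conservative force with the potential function $V(\mathbf{x})$. If
$V$ is invariant under rotations in $\mathbb{R}^2$, then the angular momentum
$J = x_1 p_2 - x_2 p_1$ is independent of time along any solution of Newton’s equation.
Conversely, if $J$ is independent of time along every solution of Newton’s
equation, then $V$ is invariant under rotations.

Proof. Differentiating (2.16) along a solution of Newton’s law gives

$$\begin{aligned}
\frac{dJ}{dt} &= \frac{dx_1}{dt}p_2 + x_1\frac{dp_2}{dt} - \frac{dx_2}{dt}p_1 - x_2\frac{dp_1}{dt} \\
&= \frac{1}{m}p_1 p_2 - x_1\frac{\partial V}{\partial x_2} - \frac{1}{m}p_2 p_1 + x_2\frac{\partial V}{\partial x_1} \\
&= x_2\frac{\partial V}{\partial x_1} - x_1\frac{\partial V}{\partial x_2}.
\end{aligned}$$

On the other hand, consider rotations $R_\theta$ in $\mathbb{R}^2$ given by

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

If we differentiate $V$ along this family of rotations, we obtain

$$\left.\frac{d}{d\theta}V(R_\theta\mathbf{x})\right|_{\theta=0} = \frac{\partial V}{\partial x_1}\frac{dx_1}{d\theta} + \frac{\partial V}{\partial x_2}\frac{dx_2}{d\theta} = -x_2\frac{\partial V}{\partial x_1} + x_1\frac{\partial V}{\partial x_2}.$$

Thus, the angular derivative of $V$ is zero if and only if $J$ is constant. ■

Conservation of $J$ [together with the relation (2.18)] gives the following
result.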


Corollary 2.19 (Kepler’s Second Law) Suppose a particle is moving
in $\mathbb{R}^2$ in the presence of a force associated with a rotationally invariant
potential. If $\mathbf{x}(t)$ is the trajectory of the particle, then the area swept out by
$\mathbf{x}(t)$ between times $t = a$ and $t = b$ is $(b - a)J/(2m)$, where $J$ is the constant
value of the angular momentum along the trajectory. Since the area swept
out depends only on $b - a$, we may say that “equal areas are swept out in
equal times.”

Kepler, of course, was interested in the motion of planets in $\mathbb{R}^3$, not in
$\mathbb{R}^2$. The motion of a planet moving in the “inverse square” force of a sun
will, however, always lie in a plane. (This claim follows from the
three-dimensional version of conservation of angular momentum, as explained in
Sect. 2.6.1.)
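Corollary 2.19 can be illustrated with a short planar simulation. The sketch below integrates an attractive inverse-square force in $\mathbb{R}^2$, tracks $J = x_1 p_2 - x_2 p_1$, and accumulates the signed areas of the thin triangles spanned by successive positions as an approximation to the swept area. The force constant, initial data, and the velocity Verlet scheme are assumptions; for a central force this particular scheme happens to reproduce the area law up to rounding, which makes the comparison especially clean.

```python
import math

m, k = 1.0, 1.0                      # assumed mass and force constant

def accel(x):
    """Acceleration from the central force F(x) = -k x / |x|^3."""
    r = math.hypot(x[0], x[1])
    return [-k * x[0] / (m * r ** 3), -k * x[1] / (m * r ** 3)]

def angular_momentum(x, v):
    return m * (x[0] * v[1] - x[1] * v[0])      # J = x1 p2 - x2 p1, with p = m v

x, v = [1.0, 0.0], [0.0, 0.8]
J0 = angular_momentum(x, v)
dt, steps = 1e-3, 4000
area = 0.0
a = accel(x)
for _ in range(steps):
    x_new = [x[i] + v[i] * dt + 0.5 * a[i] * dt * dt for i in range(2)]
    a_new = accel(x_new)
    v = [v[i] + 0.5 * (a[i] + a_new[i]) * dt for i in range(2)]
    # signed area of the triangle with vertices 0, x, x_new
    area += 0.5 * (x[0] * x_new[1] - x[1] * x_new[0])
    x, a = x_new, a_new

J_end = angular_momentum(x, v)
predicted_area = steps * dt * J0 / (2 * m)       # (b - a) J / (2m)
print("J:", J0, "->", J_end)
print("area swept:", area, " predicted:", predicted_area)
```

Replacing the inverse-square law by any other force of the form $g(|\mathbf{x}|)\mathbf{x}$ should leave both comparisons intact, since only rotational invariance of the potential is used.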

In $\mathbb{R}^3$, the angular momentum of the particle is a vector, given by

$$\mathbf{J} = \mathbf{x} \times \mathbf{p}, \tag{2.19}$$

where $\times$ denotes the cross product (or vector product). Thus, for example,

$$J_3 = x_1 p_2 - x_2 p_1. \tag{2.20}$$

If, then, we have a particle in $\mathbb{R}^3$ that just happens to be moving in $\mathbb{R}^2$
(i.e., $x_3 = 0$ and $p_3 = 0$), then the angular momentum will be in the
$z$-direction with $z$-component given by the quantity $J$ defined in
Definition 2.17.


The representation of the angular momentum of a particle in $\mathbb{R}^3$ as a
vector is a low-dimensional peculiarity. For a particle in $\mathbb{R}^n$, the angular
momentum is a skew-symmetric matrix given by

$$J_{jk} = x_j p_k - x_k p_j. \tag{2.21}$$

In the $\mathbb{R}^3$ case, the entries of the $3 \times 3$ angular momentum matrix are made
up by the three components of the angular momentum vector together with
their negatives, with zeros along the diagonal. [Compare, e.g., (2.20) and
(2.21).]

Definition 2.20 For a system of $N$ particles moving in $\mathbb{R}^n$, the total
angular momentum of the system is the skew-symmetric matrix $J$ given
by

$$J_{jk} = \sum_{l=1}^N \left(x^l_j p^l_k - x^l_k p^l_j\right). \tag{2.22}$$

Theorem 2.21 Suppose a system of $N$ particles in $\mathbb{R}^n$ is moving under
the influence of conservative forces with potential function $V$. If $V$ satisfies

$$V(R\mathbf{x}^1, R\mathbf{x}^2, \ldots, R\mathbf{x}^N) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.23}$$

for every rotation matrix $R$, then the total angular momentum of the system
is conserved (constant along each trajectory). Conversely, if the total
angular momentum is constant along each trajectory, then $V$ satisfies (2.23).

The  proof  of  this  result  is  similar  to  that  of  Proposition  2.18  and  is  left 
as  an  exercise  (Exercise  10).  We  will  re-examine  the  concept  of  angular 
momentum  in  the  next  section  using  the  language  of  Poisson  brackets  and 
Hamiltonian  flows. 


2.5  Poisson  Brackets  and  Hamiltonian  Mechanics 


We  consider  now  the  Hamiltonian  approach  to  classical  mechanics.  (There 
is  also  the  Lagrangian  approach,  but  that  approach  is  not  as  relevant  for 
our  purposes.)  The  Hamiltonian  approach,  and  in  particular  the  Poisson 
bracket,  will  help  us  to  understand  the  general  phenomenon  of  conserved 
quantities.  The  Poisson  bracket  is  also  an  important  source  of  motivation 
for  the  use  of  commutators  in  quantum  mechanics. 

In the Hamiltonian approach to mechanics, we think of the energy function
as a function of position and momentum, rather than position and
velocity, and we refer to it as the “Hamiltonian.” If a particle in $\mathbb{R}^n$ has
the usual sort of energy function (kinetic energy plus potential energy), we
have

$$H(\mathbf{x}, \mathbf{p}) = \frac{1}{2m}\sum_{j=1}^n p_j^2 + V(\mathbf{x}). \tag{2.24}$$
Here, as usual, $p_j = m\dot{x}_j$. We now observe that Newton’s law can be
expressed in the following form:

$$\begin{aligned}
\frac{dx_j}{dt} &= \frac{\partial H}{\partial p_j} \\
\frac{dp_j}{dt} &= -\frac{\partial H}{\partial x_j}.
\end{aligned} \tag{2.25}$$

After all, with $H$ of the indicated form, these equations read $dx_j/dt = p_j/m$,
which is just the definition of $p_j$, and $dp_j/dt = -\partial V/\partial x_j = F_j$, which
is just Newton’s law, in the form originally given by Newton. We refer to
Newton’s law, in the form (2.25), as Hamilton’s equations.
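The passage from the Hamiltonian (2.24) to the right-hand sides of (2.25) can be spelled out mechanically. The sketch below approximates $\partial H/\partial p_j$ and $-\partial H/\partial x_j$ by central differences for an assumed quadratic potential $V(\mathbf{x}) = \frac{1}{2}k|\mathbf{x}|^2$ and compares them with $p_j/m$ and $F_j = -kx_j$. The potential, the numerical values, and the function names are illustrative assumptions.

```python
m, k = 2.0, 5.0                      # assumed mass and spring constant

def V(x):
    return 0.5 * k * sum(c * c for c in x)

def H(x, p):
    return sum(c * c for c in p) / (2 * m) + V(x)     # equation (2.24)

def hamilton_rhs(x, p, h=1e-6):
    """Approximate (dx/dt, dp/dt) = (dH/dp, -dH/dx) by central differences."""
    n = len(x)
    dx, dp = [], []
    for j in range(n):
        e = [h if i == j else 0.0 for i in range(n)]
        p_plus = [a + b for a, b in zip(p, e)]
        p_minus = [a - b for a, b in zip(p, e)]
        x_plus = [a + b for a, b in zip(x, e)]
        x_minus = [a - b for a, b in zip(x, e)]
        dx.append((H(x, p_plus) - H(x, p_minus)) / (2 * h))
        dp.append(-(H(x_plus, p) - H(x_minus, p)) / (2 * h))
    return dx, dp

x, p = [0.3, -1.0], [0.4, 2.0]
dx, dp = hamilton_rhs(x, p)
print("dx/dt:", dx, " expected p/m: ", [c / m for c in p])
print("dp/dt:", dp, " expected -k x:", [-k * c for c in x])
```

Nothing in `hamilton_rhs` depends on the particular form of `H`, which hints at why the Hamiltonian formulation extends to energy functions more general than (2.24).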

Although  it  is  not  obvious  at  the  moment  that  we  have  gained  anything 
by  writing  Newton’s  law  in  the  form  (2.25),  let  us  proceed  on  a  bit  further 
and  see.  Our  next  step  is  to  introduce  the  Poisson  bracket. 

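Definition 2.22 Let $f$ and $g$ be two smooth functions on $\mathbb{R}^{2n}$, where an
element of $\mathbb{R}^{2n}$ is thought of as a pair $(\mathbf{x}, \mathbf{p})$, with $\mathbf{x} \in \mathbb{R}^n$ representing the
position of a particle and $\mathbf{p} \in \mathbb{R}^n$ representing the momentum of a particle.
Then the Poisson bracket of $f$ and $g$, denoted $\{f, g\}$, is the function on
$\mathbb{R}^{2n}$ given by

$$\{f, g\}(\mathbf{x}, \mathbf{p}) = \sum_{j=1}^n \left(\frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_j}\right).$$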

The  Poisson  bracket  has  the  following  properties. 

Proposition 2.23 For all smooth functions $f$, $g$, and $h$ on $\mathbb{R}^{2n}$ we have
the following:

1. $\{f, g + ch\} = \{f, g\} + c\{f, h\}$ for all $c \in \mathbb{R}$
2. $\{g, f\} = -\{f, g\}$
3. $\{f, gh\} = \{f, g\}h + g\{f, h\}$
4. $\{f, \{g, h\}\} = \{\{f, g\}, h\} + \{g, \{f, h\}\}$

Properties 1 and 2 of Proposition 2.23 say that the Poisson bracket is
bilinear and skew-symmetric. Property 3 says that the operation of “bracket
with $f$” satisfies the derivation property (similar to the product rule for
derivatives) with respect to pointwise multiplication of functions, while
Property 4 says that “bracket with $f$” satisfies the derivation property
with respect to the Poisson bracket itself. Property 4 is equivalent to the
Jacobi identity:

$$\{f, \{g, h\}\} + \{h, \{f, g\}\} + \{g, \{h, f\}\} = 0, \tag{2.26}$$

as may easily be seen using the skew-symmetry of the Poisson bracket.
The Jacobi identity, along with bilinearity and skew-symmetry, means that
the space of $C^\infty$ functions on $\mathbb{R}^{2n}$ forms a Lie algebra under the operation
of a Poisson bracket. (See Chap. 16.)

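Proof. The first two properties of the Poisson bracket are obvious and the
third is an easy consequence of the product rule. Let us think about what
goes into proving Property 4 by direct computation. (An alternative proof
is given in Exercise 15.) We compute that

$$\{f, \{g, h\}\} = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j}\sum_{k=1}^n\left(\frac{\partial g}{\partial x_k}\frac{\partial h}{\partial p_k} - \frac{\partial g}{\partial p_k}\frac{\partial h}{\partial x_k}\right) - \sum_{j=1}^n \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}\sum_{k=1}^n\left(\frac{\partial g}{\partial x_k}\frac{\partial h}{\partial p_k} - \frac{\partial g}{\partial p_k}\frac{\partial h}{\partial x_k}\right).$$

Just the first term in the expression for $\{f, \{g, h\}\}$ generates the following
four terms (all summed over $j$ and $k$) after we use the product rule:

$$\frac{\partial f}{\partial x_j}\frac{\partial^2 g}{\partial p_j \partial x_k}\frac{\partial h}{\partial p_k} + \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial x_k}\frac{\partial^2 h}{\partial p_j \partial p_k} - \frac{\partial f}{\partial x_j}\frac{\partial^2 g}{\partial p_j \partial p_k}\frac{\partial h}{\partial x_k} - \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_k}\frac{\partial^2 h}{\partial p_j \partial x_k}. \tag{2.27}$$

We see, then, that the left-hand side of (2.26) will have a total of 24 terms,
each summed over $j$ and $k$. Each term will have a single derivative on two of the
three functions, and two derivatives on the third function. There are three
possibilities for which function gets two derivatives. Once that function is
chosen, there are four possibilities for which derivatives go on the other
two functions, with the function that gets two derivatives getting whatever
derivatives remain (for a total of two $x$-derivatives and two $p$-derivatives).
That makes 12 possible terms. It is a tedious but straightforward exercise
to check that each of these 12 possible terms occurs twice in the left-hand
side of (2.26), with opposite signs. To check just one case explicitly, in
computing $\{h, \{f, g\}\}$, we will get a term like the second term in (2.27),
but with $(f, g, h)$ replaced by $(h, f, g)$:

$$\frac{\partial h}{\partial x_j}\frac{\partial f}{\partial x_k}\frac{\partial^2 g}{\partial p_j \partial p_k}.$$

This term (in the computation of $\{h, \{f, g\}\}$) cancels with the third term
in (2.27) (in the computation of $\{f, \{g, h\}\}$), once the summation indices
$j$ and $k$ are interchanged. ■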

The  following  elementary  result  will  provide  a  helpful  analogy  to  the 
“canonical  commutation  relations”  in  quantum  mechanics. 


Proposition 2.24 The position and momentum functions satisfy the
following Poisson bracket relations:

$$\begin{aligned}
\{x_j, x_k\} &= 0 \\
\{p_j, p_k\} &= 0 \\
\{x_j, p_k\} &= \delta_{jk}.
\end{aligned}$$

Proof. Direct calculation. ■
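The bracket of Definition 2.22 is straightforward to approximate numerically, which gives a quick way to test the relations in Propositions 2.23 and 2.24 on examples. The sketch below uses central differences for the partial derivatives; the step size, the test point, and the sample functions `f`, `g`, `w` are assumptions, and the Jacobi sum is only expected to vanish up to discretization error.

```python
def partial(f, z, i, h=1e-3):
    """Central-difference approximation of the i-th partial derivative of f at z."""
    zp, zm = list(z), list(z)
    zp[i] += h
    zm[i] -= h
    return (f(zp) - f(zm)) / (2 * h)

def poisson(f, g, n):
    """Return the function {f, g} on R^{2n}, with z = (x_1..x_n, p_1..p_n)."""
    def bracket(z):
        return sum(partial(f, z, j) * partial(g, z, n + j)
                   - partial(f, z, n + j) * partial(g, z, j)
                   for j in range(n))
    return bracket

def coord(i):
    return lambda z: z[i]

n = 2
x = [coord(j) for j in range(n)]          # position functions x_1, x_2
p = [coord(n + j) for j in range(n)]      # momentum functions p_1, p_2
z0 = [0.7, -0.3, 1.1, 0.5]                # arbitrary test point (x1, x2, p1, p2)

canonical = [[poisson(x[j], p[k], n)(z0) for k in range(n)] for j in range(n)]
print("matrix of {x_j, p_k}:", canonical)

f = lambda z: z[0] * z[3] - z[1] * z[2]           # x1 p2 - x2 p1
g = lambda z: z[0] ** 2 * z[1] + z[2] * z[3]      # x1^2 x2 + p1 p2
w = lambda z: z[2] ** 2 * z[1] + z[0] * z[3]      # p1^2 x2 + x1 p2

skew = poisson(f, g, n)(z0) + poisson(g, f, n)(z0)
jacobi = (poisson(f, poisson(g, w, n), n)(z0)
          + poisson(w, poisson(f, g, n), n)(z0)
          + poisson(g, poisson(w, f, n), n)(z0))
print("skew-symmetry defect:", skew)
print("Jacobi sum (2.26):   ", jacobi)
```

Because `poisson` returns a function, brackets can be nested directly, mirroring the way the Jacobi identity is written in (2.26).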

One  of  the  main  reasons  for  considering  the  Poisson  bracket  is  the 
following  simple  result. 

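Proposition 2.25 If $(\mathbf{x}(t), \mathbf{p}(t))$ is a solution to Hamilton’s equations
(2.25), then for any smooth function $f$ on $\mathbb{R}^{2n}$ we have

$$\frac{d}{dt}f(\mathbf{x}(t), \mathbf{p}(t)) = \{f, H\}(\mathbf{x}(t), \mathbf{p}(t)).$$

We generally write Proposition 2.25 in a more concise form as

$$\frac{df}{dt} = \{f, H\},$$

where the time derivative is understood as being along some trajectory.

Proof. Using the chain rule and Hamilton’s equations, we have

$$\begin{aligned}
\frac{df}{dt} &= \sum_{j=1}^n \left(\frac{\partial f}{\partial x_j}\frac{dx_j}{dt} + \frac{\partial f}{\partial p_j}\frac{dp_j}{dt}\right) \\
&= \sum_{j=1}^n \left(\frac{\partial f}{\partial x_j}\frac{\partial H}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial H}{\partial x_j}\right) \\
&= \{f, H\},
\end{aligned}$$

as claimed. ■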

Observe that Proposition 2.25 includes Hamilton’s equations themselves
as special cases, by taking $f(\mathbf{x}, \mathbf{p}) = x_j$ and by taking $f(\mathbf{x}, \mathbf{p}) = p_j$. Thus,
this proposition gives a more coordinate-independent way of expressing the
time-evolution.


Corollary 2.26 Call a smooth function $f$ on $\mathbb{R}^{2n}$ a conserved quantity if
$f(\mathbf{x}(t), \mathbf{p}(t))$ is independent of $t$ for each solution $(\mathbf{x}(t), \mathbf{p}(t))$ of Hamilton’s
equations. Then $f$ is a conserved quantity if and only if

$$\{f, H\} = 0.$$

In particular, the Hamiltonian $H$ is a conserved quantity.

Conserved  quantities  are  also  called  constants  of  motion.  See  Conclusion 
2.31  for  another  perspective  on  this  result.  Conserved  quantities  (when  one 
can  find  them)  are  useful  in  that  we  know  that  trajectories  must  lie  in 
the  level  surfaces  of  any  conserved  quantity.  Suppose,  for  example,  that 
we have a particle moving in $\mathbb{R}^2$ and that the Hamiltonian $H$ and one
other independent function $f$ (such as, say, the angular momentum) are
conserved quantities. Then, rather than looking for trajectories in the
four-dimensional phase space, we look for them inside the joint level sets of $H$
and $f$ (sets of the form $H(\mathbf{x},\mathbf{p}) = a$, $f(\mathbf{x},\mathbf{p}) = b$, for some constants $a$
and $b$). These joint level sets are (generically) two-dimensional instead of
four-dimensional, so using the constants of motion greatly simplifies the
problem, from an equation in four variables to one in only two variables.

Solving Hamilton’s equations on $\mathbb{R}^{2n}$ gives rise to a flow on $\mathbb{R}^{2n}$, that is, a
family $\Phi_t$ of diffeomorphisms of $\mathbb{R}^{2n}$, where $\Phi_t(\mathbf{x}, \mathbf{p})$ is equal to the solution
at time $t$ of Hamilton’s equations with initial condition $(\mathbf{x}, \mathbf{p})$. Since it is
possible (depending on the choice of potential function $V$) that a particle
can escape to infinity in finite time, the maps $\Phi_t$ are not necessarily defined
on all of $\mathbb{R}^{2n}$, but only on some open subset thereof. If $\Phi_t$ does happen to
be defined on all of $\mathbb{R}^{2n}$ (for all $t$), then we say that the flow is complete.

Theorem 2.27 (Liouville’s Theorem) The flow associated with Hamilton’s
equations, for an arbitrary Hamiltonian function $H$, preserves the
$(2n)$-dimensional volume measure

\[ dx_1\, dx_2 \cdots dx_n\, dp_1\, dp_2 \cdots dp_n. \]

What this means, more precisely, is that if a measurable set $E$ is contained
in the domain of $\Phi_t$ for some $t \in \mathbb{R}$, then the volume of $\Phi_t(E)$ is
equal to the volume of $E$.

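Proof. Hamilton’s equations may be written as

\[
\frac{d}{dt}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \\ p_1 \\ \vdots \\ p_n \end{pmatrix}
=
\begin{pmatrix}
\partial H/\partial p_1 \\ \vdots \\ \partial H/\partial p_n \\
-\partial H/\partial x_1 \\ \vdots \\ -\partial H/\partial x_n
\end{pmatrix}.
\tag{2.28}
\]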


This means that Hamilton’s equations describe the flow along the vector
field on $\mathbb{R}^{2n}$ appearing on the right-hand side of (2.28). By a standard result
from  vector  calculus  (see,  e.g.,  Proposition  16.33  in  [29]),  this  flow  will  be 
volume-preserving  if  and  only  if  the  divergence  of  the  vector  field  is  zero. 
We  compute  this  divergence  as 


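\[
\frac{\partial}{\partial x_1} \frac{\partial H}{\partial p_1} + \cdots + \frac{\partial}{\partial x_n} \frac{\partial H}{\partial p_n}
- \frac{\partial}{\partial p_1} \frac{\partial H}{\partial x_1} - \cdots - \frac{\partial}{\partial p_n} \frac{\partial H}{\partial x_n}.
\tag{2.29}
\]

Since

\[ \frac{\partial^2 H}{\partial x_j\, \partial p_j} = \frac{\partial^2 H}{\partial p_j\, \partial x_j}, \]

the divergence is zero. ■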
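
To see the cancellation in (2.29) concretely, here is a small worked example, not from the original text. The Hamiltonian $H(x,p) = x^2 p$ on $\mathbb{R}^2$ is a hypothetical choice, picked only because neither term of the divergence vanishes on its own.

```latex
% Assumed example: n = 1 and H(x, p) = x^2 p.
% The vector field on the right-hand side of (2.28) is
\[
\left( \frac{\partial H}{\partial p},\; -\frac{\partial H}{\partial x} \right)
= \left( x^2,\; -2xp \right),
\]
% and its divergence is
\[
\frac{\partial}{\partial x}\left( x^2 \right) + \frac{\partial}{\partial p}\left( -2xp \right)
= 2x - 2x = 0 .
\]
```

The two terms are individually nonzero and cancel because both equal the mixed second derivative of $H$, up to sign.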

The existence of an invariant volume has important consequences for
the dynamics of a system. For example, for “confined” systems, an invariant
volume implies that the system exhibits “recurrence,” which means
(roughly) that for most initial conditions, the particle will eventually come
back arbitrarily close to its initial state in phase space. We will not, however,
delve into this aspect of the theory.

Note that the divergence of the vector field $X_H$ in (2.28), computed in (2.29),
vanishes in a very particular way, namely the sum of the $j$th and $(n + j)$th terms
vanishes for all $1 \le j \le n$. This stronger condition turns out to be equivalent to
the condition that the Hamiltonian flow $\Phi_t$ associated with an arbitrary
smooth function on $\mathbb{R}^{2n}$ preserves the symplectic form $\omega$, defined by

\[ \omega((\mathbf{x}, \mathbf{p}), (\mathbf{x}', \mathbf{p}')) = \mathbf{x} \cdot \mathbf{p}' - \mathbf{p} \cdot \mathbf{x}'. \]

What this means, more precisely, is that for any $t \in \mathbb{R}$ and any $(\mathbf{x}, \mathbf{p}) \in \mathbb{R}^{2n}$,
the matrix of partial derivatives of $\Phi_t$ at the point $(\mathbf{x}, \mathbf{p})$ (thought of as a
linear map of $\mathbb{R}^{2n}$ to $\mathbb{R}^{2n}$) preserves $\omega$. This property of $\Phi_t$, as it turns out,
is equivalent to the property that $\Phi_t$ preserves Poisson brackets, meaning
that

\[ \{f \circ \Phi_t, g \circ \Phi_t\} = \{f, g\} \circ \Phi_t \]

for all $f, g \in C^\infty(\mathbb{R}^{2n})$. A map $T : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ that preserves $\omega$ is called
a symplectomorphism (in mathematics notation) or a canonical transformation
(in physics notation). We defer the proofs of these claims until
Chap. 21, where we can consider them in a more general setting.

Definition 2.28 For any smooth function $f$ on $\mathbb{R}^{2n}$, the Hamiltonian
flow generated by $f$ is the flow obtained by solving Hamilton’s equation (2.25)
with the Hamiltonian $H$ replaced by $f$. The function $f$ is called the
Hamiltonian generator of the associated flow.

Although any smooth function on $\mathbb{R}^{2n}$ can be inserted into Hamilton’s
equations to produce a flow, physically one should think that there is a
distinguished function, the Hamiltonian $H$ of the system, such that the
flow generated by $H$ is the time-evolution of the system. For any other
function $f$, the Hamiltonian flow generated by $f$ should not be thought
of as time-evolution, but as some other flow, which might, for example,
represent some family of symmetries of our system.

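Proposition 2.29 The Hamiltonian flow generated by the function

\[ f_{\mathbf{a}}(\mathbf{x}, \mathbf{p}) := \mathbf{a} \cdot \mathbf{p} \tag{2.30} \]

is given by

\[ \mathbf{x}(t) = \mathbf{x}_0 + t\mathbf{a}, \qquad \mathbf{p}(t) = \mathbf{p}_0, \tag{2.31} \]

and the Hamiltonian flow generated by the function

\[ g_{\mathbf{b}}(\mathbf{x}, \mathbf{p}) := \mathbf{b} \cdot \mathbf{x} \tag{2.32} \]

is given by

\[ \mathbf{x}(t) = \mathbf{x}_0, \qquad \mathbf{p}(t) = \mathbf{p}_0 - t\mathbf{b}. \]

Proof. Direct calculation. ■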
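
The direct calculation can be written out as follows. This is a sketch that assumes Hamilton’s equations in the form $dx_j/dt = \partial H/\partial p_j$, $dp_j/dt = -\partial H/\partial x_j$, with $H$ replaced by the generating function as in Definition 2.28.

```latex
% Flow generated by f_a(x, p) = a . p:
\[
\frac{dx_j}{dt} = \frac{\partial f_{\mathbf{a}}}{\partial p_j} = a_j, \qquad
\frac{dp_j}{dt} = -\frac{\partial f_{\mathbf{a}}}{\partial x_j} = 0,
\]
% so x(t) = x_0 + t a and p(t) = p_0.
% Flow generated by g_b(x, p) = b . x:
\[
\frac{dx_j}{dt} = \frac{\partial g_{\mathbf{b}}}{\partial p_j} = 0, \qquad
\frac{dp_j}{dt} = -\frac{\partial g_{\mathbf{b}}}{\partial x_j} = -b_j,
\]
% so x(t) = x_0 and p(t) = p_0 - t b.
```

Both systems have constant right-hand sides, so integrating in $t$ gives the stated flows at once.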

What  this  means  is  that  the  Hamiltonian  flow  generated  by  a  linear 
combination  of  the  momentum  functions  consists  of  translations  in  position 
of the particle. That is to say, in the flow (2.31) generated by the function
$f_{\mathbf{a}}$ in (2.30), the particle’s initial position $\mathbf{x}_0$ is translated by $t\mathbf{a}$, while the
particle’s momentum is independent of $t$. Similarly, the Hamiltonian flow
generated by a linear combination of the position functions [the function
$g_{\mathbf{b}}$ in (2.32)] consists of translations in the particle’s momentum.

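Proposition 2.30 For a particle moving in $\mathbb{R}^2$, the Hamiltonian flow
generated by the angular momentum function

\[ J(\mathbf{x}, \mathbf{p}) = x_1 p_2 - x_2 p_1 \]

consists of simultaneous rotations of $\mathbf{x}$ and $\mathbf{p}$. That is to say,

\[
\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}
= \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}
\begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix},
\qquad
\begin{pmatrix} p_1(t) \\ p_2(t) \end{pmatrix}
= \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}
\begin{pmatrix} p_1(0) \\ p_2(0) \end{pmatrix}.
\tag{2.33}
\]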


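Proof. If we plug the angular momentum function $J$ into Hamilton’s
equations in place of $H$, we obtain

\[
\begin{aligned}
\frac{dx_1}{dt} &= \frac{\partial J}{\partial p_1} = -x_2, &\qquad
\frac{dp_1}{dt} &= -\frac{\partial J}{\partial x_1} = -p_2, \\
\frac{dx_2}{dt} &= \frac{\partial J}{\partial p_2} = x_1, &\qquad
\frac{dp_2}{dt} &= -\frac{\partial J}{\partial x_2} = p_1.
\end{aligned}
\]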

The  solution  to  this  system  is  given  by  the  expression  in  the  proposition, 
as  is  easily  verified  by  differentiation  of  (2.33).  ■ 

Note  that  since  the  Hamiltonian  flow  generated  by  J  does  not  have  the 
interpretation  of  the  time-evolution  of  the  particle,  the  parameter  t  in  (2.33) 
should  not  be  interpreted  as  the  physical  time;  it  is  just  the  parameter  in  a 
one-parameter group of diffeomorphisms. In this case, $t$ is the angle of
rotation. Thus, one answer to the question, “What is the angular momentum?”
is  that  J  is  the  Hamiltonian  generator  of  rotations. 
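
The claim that $J$ generates rotations can be checked numerically. The sketch below (an illustration, not part of the original text) implements the rotation formula (2.33), differentiates it in $t$ by central differences, and compares the result with the vector field obtained from Hamilton’s equations with $J$ in place of $H$. The function names and the sample point are hypothetical.

```python
import math

def j_flow(x, p, t):
    """Rotate position x and momentum p (both in R^2) by angle t, as in (2.33)."""
    c, s = math.cos(t), math.sin(t)
    return ((c * x[0] - s * x[1], s * x[0] + c * x[1]),
            (c * p[0] - s * p[1], s * p[0] + c * p[1]))

def hamiltonian_field(x, p):
    """Right-hand side of Hamilton's equations with J = x1 p2 - x2 p1 as generator."""
    # dx1/dt = dJ/dp1 = -x2,   dx2/dt = dJ/dp2 = x1
    # dp1/dt = -dJ/dx1 = -p2,  dp2/dt = -dJ/dx2 = p1
    return ((-x[1], x[0]), (-p[1], p[0]))

def angular_momentum(x, p):
    return x[0] * p[1] - x[1] * p[0]

def field_mismatch(x, p, t, h=1e-6):
    """Compare d/dt of the rotation flow with the Hamiltonian vector field."""
    (xa, pa), (xb, pb) = j_flow(x, p, t + h), j_flow(x, p, t - h)
    xt, pt = j_flow(x, p, t)
    vx, vp = hamiltonian_field(xt, pt)
    numeric = [(a - b) / (2 * h) for a, b in zip(xa + pa, xb + pb)]
    exact = list(vx + vp)
    return max(abs(n - e) for n, e in zip(numeric, exact))

x0, p0 = (1.0, 2.0), (0.5, -1.5)
print(field_mismatch(x0, p0, 0.7))
print(angular_momentum(*j_flow(x0, p0, 0.7)) - angular_momentum(x0, p0))
```

Both printed numbers should be close to zero: the first because (2.33) solves the system in the proof, the second because $J$ is constant along its own flow.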

If $f$ is any smooth function, then by the proof of Proposition 2.25, the
time derivative of any other function $g$ along the Hamiltonian flow generated
by $f$ is given by $dg/dt = \{g, f\}$. In particular, the derivative of the
Hamiltonian $H$ along the flow generated by $f$ is $\{H, f\}$. Thus, $f$ is constant
along the flow generated by $H$ if and only if $\{f, H\} = 0$, which holds if and
only if $\{H, f\} = 0$, which holds if and only if $H$ is constant along the flow
generated by $f$. This line of reasoning leads to the following result.

Conclusion 2.31 A function $f$ is a conserved quantity for solutions of
Hamilton’s equation (2.25) if and only if $H$ is invariant under the Hamiltonian
flow generated by $f$. In particular, the angular momentum $J$ is conserved
if and only if $H$ is invariant under simultaneous rotations of $\mathbf{x}$ and $\mathbf{p}$.


We  will  return  to  this  way  of  thinking  about  conserved  quantities  in 
Chap.  21.  Compare  Exercise  12. 

The  Hamiltonian  framework  can  be  extended  in  a  straightforward  way 
to  systems  of  particles. 


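Proposition 2.32 Consider the phase space for a system of $N$ particles
moving in $\mathbb{R}^n$, namely $\mathbb{R}^{2nN}$, thought of as the set of $(2N)$-tuples of the
form

\[ (\mathbf{x}^1, \ldots, \mathbf{x}^N, \mathbf{p}^1, \ldots, \mathbf{p}^N), \]

with $\mathbf{x}^j$ and $\mathbf{p}^j$ belonging to $\mathbb{R}^n$. Define the Poisson bracket of two smooth
functions $f$ and $g$ on the phase space by

\[
\{f, g\} = \sum_{j=1}^{N} \sum_{k=1}^{n}
\left( \frac{\partial f}{\partial x^j_k} \frac{\partial g}{\partial p^j_k}
- \frac{\partial f}{\partial p^j_k} \frac{\partial g}{\partial x^j_k} \right),
\]

and consider a Hamiltonian function of the form

\[
H(\mathbf{x}^1, \ldots, \mathbf{x}^N, \mathbf{p}^1, \ldots, \mathbf{p}^N)
= \sum_{j=1}^{N} \frac{1}{2m_j} |\mathbf{p}^j|^2 + V(\mathbf{x}^1, \ldots, \mathbf{x}^N).
\]

Then Newton’s law in the form $m_j \ddot{\mathbf{x}}^j = -\nabla^j V$ is equivalent to Hamilton’s
equations in the form

\[
\frac{dx^j_k}{dt} = \frac{\partial H}{\partial p^j_k}, \qquad
\frac{dp^j_k}{dt} = -\frac{\partial H}{\partial x^j_k}.
\]

For any smooth function $f$, the derivative of $f$ along a solution of Hamilton’s
equations is given by

\[ \frac{df}{dt} = \{f, H\}. \tag{2.34} \]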


The proof of these results is entirely similar to the one-particle case and
is omitted.

2.6 The Kepler Problem and the Runge-Lenz Vector


2.6.1 The Kepler Problem

We  consider  now  the  classical  Kepler  problem,  that  of  finding  the 
trajectories  of  a  planet  orbiting  the  sun.  Since  the  sun  is  very  much  more 
massive  than  any  of  the  planets,  we  may  consider  the  position  of  the  sun 
to  be  fixed  at  the  origin  of  our  coordinate  system.  The  sun  exerts  a  force 
on  a  planet  given  by 

\[ \mathbf{F} = -k \frac{\mathbf{x}}{|\mathbf{x}|^3}. \tag{2.35} \]

Here $k = GmM$, where $m$ is the mass of the planet, $M$ is the mass of the
sun, and $G$ is the universal gravitational constant. Note that the magnitude
of $\mathbf{F}$ is proportional to the reciprocal of the square of the distance from the
origin; thus, the force follows an inverse square law. Since $k$ contains a
factor of the mass $m$ of the planet, this quantity drops out of the equation
of motion, $m\ddot{\mathbf{x}} = \mathbf{F}$. The potential associated with the force (2.35) is easily
seen to be

\[ V(\mathbf{x}) = -\frac{k}{|\mathbf{x}|}. \tag{2.36} \]


Since  our  potential  V  is  invariant  under  rotations,  the  angular  momentum 
vector $\mathbf{J} = \mathbf{x} \times \mathbf{p}$ is a conserved quantity (Theorem 2.21 with $N = 1$ and
n  =  3).  If  J  =  0,  the  particle  is  moving  along  a  ray  through  the  origin. 
In  that  case,  either  the  particle  will  pass  through  the  origin  at  some  point 
in  the  future  (if  the  initial  momentum  points  toward  the  origin),  or  else 
the  particle  must  have  passed  through  the  origin  at  some  point  in  the  past 
(if  the  initial  momentum  points  away  from  the  origin).  Trajectories  of  this 
sort are called collision trajectories, and we will regard such trajectories as
pathological. 

We  will,  from  now  on,  consider  only  trajectories  along  which  the  angular 
momentum  vector  is  nonzero.  Fixing  the  energy  and  angular  momentum  of 
the  particle  guarantees  that  the  particle  stays  a  certain  minimum  distance 
from the origin (Exercise 20). Meanwhile, since $\mathbf{J} = \mathbf{x} \times \mathbf{p}$, the position
x(t)  of  the  particle  will  always  be  perpendicular  to  the  constant  value  of  J. 
We  will  therefore  refer  to  the  plane  (through  the  origin)  perpendicular  to 
J  as  the  “plane  of  motion.” 


2.6.2  Conservation  of  the  Runge-Lenz  Vector 

We are going to obtain a description of the classical trajectories in an
indirect way, using something called the Runge-Lenz vector.

Definition 2.33 The Runge-Lenz vector is the vector-valued function $\mathbf{A}$
on $(\mathbb{R}^3 \setminus \{0\}) \times \mathbb{R}^3$ given by

\[ \mathbf{A}(\mathbf{x}, \mathbf{p}) = \frac{1}{mk}\, \mathbf{p} \times \mathbf{J} - \frac{\mathbf{x}}{|\mathbf{x}|}. \]

Here  x  represents  the  position  of  a  classical  particle  and  p  its  momentum. 

The  significance  of  this  vector  is  that  it  is  a  conserved  quantity  for  the 
Kepler problem. Of course, whenever the potential energy is radial (a function
of the distance from the origin), the angular momentum vector is a
conserved quantity. What is special about the $1/r$ potential of the Kepler
problem is that there is another conserved vector-valued quantity.

Proposition 2.34 The Runge-Lenz vector is a conserved quantity for Newton’s
law with force given by (2.35).

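Proof. Since $\mathbf{J}$ is conserved, we compute that

\[
\begin{aligned}
\frac{d\mathbf{A}}{dt}
&= \frac{1}{mk}\, \mathbf{F} \times \mathbf{J} - \frac{1}{|\mathbf{x}|} \frac{\mathbf{p}}{m}
 - \mathbf{x} \sum_{j=1}^{3} \frac{\partial}{\partial x_j}\!\left( \frac{1}{|\mathbf{x}|} \right) \frac{dx_j}{dt} \\
&= -\frac{1}{m} \frac{1}{|\mathbf{x}|^3}\, \mathbf{x} \times (\mathbf{x} \times \mathbf{p})
 - \frac{1}{m} \frac{\mathbf{p}}{|\mathbf{x}|}
 + \frac{\mathbf{x}}{|\mathbf{x}|^3} \sum_{j=1}^{3} x_j \frac{p_j}{m} \\
&= \frac{1}{m} \left[ -\frac{1}{|\mathbf{x}|^3}\, \mathbf{x} (\mathbf{x} \cdot \mathbf{p})
 + \frac{1}{|\mathbf{x}|^3}\, \mathbf{p} (\mathbf{x} \cdot \mathbf{x})
 - \frac{\mathbf{p}}{|\mathbf{x}|}
 + \frac{\mathbf{x} (\mathbf{x} \cdot \mathbf{p})}{|\mathbf{x}|^3} \right] \\
&= 0.
\end{aligned}
\]

Here we have used the identity $\mathbf{b} \times (\mathbf{c} \times \mathbf{d}) = \mathbf{c}(\mathbf{b} \cdot \mathbf{d}) - \mathbf{d}(\mathbf{b} \cdot \mathbf{c})$, which holds
for all vectors $\mathbf{b}, \mathbf{c}, \mathbf{d} \in \mathbb{R}^3$. ■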
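
Conservation of the Runge-Lenz vector can also be observed numerically. The following sketch (an illustration, not part of the original text) integrates Hamilton’s equations for $H = |\mathbf{p}|^2/(2m) - k/|\mathbf{x}|$ with a classical fourth-order Runge-Kutta step and tracks how far $\mathbf{A}$ drifts from its initial value. The integrator, the units $m = k = 1$, the step size, and the initial condition are all assumptions made for the example.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def runge_lenz(x, p, m, k):
    """A = (1/(m k)) p x J - x/|x|, with J = x x p."""
    J = cross(x, p)
    pxJ = cross(p, J)
    r = norm(x)
    return tuple(pxJ[i] / (m * k) - x[i] / r for i in range(3))

def deriv(state, m, k):
    """Hamilton's equations: dx/dt = p/m, dp/dt = -k x/|x|^3."""
    x, p = state[:3], state[3:]
    r3 = norm(x) ** 3
    return tuple(pi / m for pi in p) + tuple(-k * xi / r3 for xi in x)

def rk4_step(state, dt, m, k):
    """One classical Runge-Kutta step (a generic integrator, not specific to this problem)."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = deriv(state, m, k)
    k2 = deriv(shifted(state, k1, dt / 2), m, k)
    k3 = deriv(shifted(state, k2, dt / 2), m, k)
    k4 = deriv(shifted(state, k3, dt), m, k)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_drift(x0, p0, m=1.0, k=1.0, dt=1e-3, steps=5000):
    """Return the initial A and the largest |A(t) - A(0)| seen along the orbit."""
    state = tuple(x0) + tuple(p0)
    A0 = runge_lenz(state[:3], state[3:], m, k)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, m, k)
        A = runge_lenz(state[:3], state[3:], m, k)
        worst = max(worst, norm(tuple(a - b for a, b in zip(A, A0))))
    return A0, worst

A0, drift = max_drift((1.0, 0.0, 0.0), (0.0, 1.2, 0.3))
print(A0, drift)
```

The drift reflects only the integrator’s discretization error; shrinking `dt` should make it smaller, as expected for an exactly conserved quantity.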


2.6.3 Ellipses, Hyperbolas, and Parabolas

We  now  use  the  Runge-Lenz  vector  to  determine  the  trajectories  for  the 
Kepler  problem. 


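Proposition 2.35 The magnitude of the Runge-Lenz vector $\mathbf{A}$ satisfies

\[ |\mathbf{A}|^2 = 1 + \frac{2 |\mathbf{J}|^2}{mk^2} E, \]

where $E = |\mathbf{p}|^2/(2m) - k/|\mathbf{x}|$ is the energy of the particle. Furthermore,
if $\hat{\mathbf{x}} := \mathbf{x}/|\mathbf{x}|$ is the unit vector in the $\mathbf{x}$-direction, we have

\[ \mathbf{A} \cdot \hat{\mathbf{x}} = \frac{|\mathbf{J}|^2}{mk\, |\mathbf{x}|} - 1 \tag{2.37} \]

for all nonzero $\mathbf{x}$. It follows from (2.37) that

\[ |\mathbf{x}| = \frac{|\mathbf{J}|^2}{mk\,(1 + \mathbf{A} \cdot \hat{\mathbf{x}})}. \]

Note that from (2.37), $\mathbf{A} \cdot \hat{\mathbf{x}} > -1$ for all points $(\mathbf{x}, \mathbf{p})$ with $\mathbf{x} \neq 0$ and $\mathbf{J} \neq 0$.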
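Proof. Using the identity $\mathbf{b} \cdot (\mathbf{c} \times \mathbf{d}) = \mathbf{d} \cdot (\mathbf{b} \times \mathbf{c})$, we see that

\[ \mathbf{x} \cdot (\mathbf{p} \times \mathbf{J}) = \mathbf{J} \cdot (\mathbf{x} \times \mathbf{p}) = |\mathbf{J}|^2. \]

Since $\mathbf{J}$ and $\mathbf{p}$ are orthogonal, we get

\[
\begin{aligned}
|\mathbf{A}|^2
&= \frac{1}{m^2 k^2} |\mathbf{p}|^2 |\mathbf{J}|^2 - \frac{2}{mk\, |\mathbf{x}|}\, \mathbf{x} \cdot (\mathbf{p} \times \mathbf{J}) + 1 \\
&= 1 + \frac{2 |\mathbf{J}|^2}{mk^2} \left( \frac{|\mathbf{p}|^2}{2m} - \frac{k}{|\mathbf{x}|} \right) \\
&= 1 + \frac{2 |\mathbf{J}|^2}{mk^2} E.
\end{aligned}
\]

Using again the identity for $\mathbf{b} \cdot (\mathbf{c} \times \mathbf{d})$, we next compute that

\[
\mathbf{A} \cdot \mathbf{x} = \frac{1}{mk}\, \mathbf{J} \cdot (\mathbf{x} \times \mathbf{p}) - \frac{\mathbf{x} \cdot \mathbf{x}}{|\mathbf{x}|}
= \frac{|\mathbf{J}|^2}{mk} - |\mathbf{x}|.
\]

We may now divide by $|\mathbf{x}|$ to obtain the desired expression for $\mathbf{A} \cdot \hat{\mathbf{x}}$. It is
then straightforward to solve for $|\mathbf{x}|$. ■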

Corollary 2.36 Choose orthonormal coordinates in the plane of motion
so that $\mathbf{A}$ lies along the positive $x_1$-axis. If $r$ and $\theta$ are the polar
coordinates associated with this coordinate system, then along each trajectory
$(r(t), \theta(t))$, we have

\[ r(t) = \frac{|\mathbf{J}|^2}{mk}\, \frac{1}{1 + A \cos \theta(t)}, \tag{2.38} \]

where $A = |\mathbf{A}|$. If $\mathbf{A} = 0$, any orthonormal coordinates can be used.

Proposition 2.37 If $A := |\mathbf{A}| < 1$, (2.38) is the equation of an ellipse with
eccentricity $A$ and with the origin being one focus of the ellipse. If $A > 1$,
(2.38) is the equation of a hyperbola, and if $A = 1$, (2.38) is the equation
of a parabola.

The orbit of the particle in the plane of motion is an ellipse if the energy
of the particle is negative, a hyperbola if the energy is positive, and a
parabola if the energy is zero.

FIGURE 2.2. Elliptical orbit for the Kepler problem, with two equal areas shaded.


Kepler’s first law is the assertion that planets move in elliptical trajectories
with the sun at one focus, as shown in Fig. 2.2. The shaded regions
indicate  two  equal  areas  that  are  swept  out  in  equal  times,  in  accordance 
with  Kepler’s  second  law  (Corollary  2.19). 

Recall that the eccentricity of an ellipse is $\sqrt{1 - (b/a)^2}$, where $a$ is half
the length of the major axis and $b$ is half the length of the minor axis.
Thus, when $A = 0$, we have $b = a$, meaning that the ellipse is a circle.
Proof. We continue to work in a coordinate system in which $\mathbf{A}$ is along
the positive $x_1$-axis, and we write $(x, y)$ for the coordinates in the plane of
motion. Then (2.38) becomes

\[ \sqrt{x^2 + y^2} = \alpha\, \frac{1}{1 + A \dfrac{x}{\sqrt{x^2 + y^2}}}, \]

where $\alpha = |\mathbf{J}|^2/(mk)$. From this we obtain

\[ 1 = \frac{1}{\alpha} \left( \sqrt{x^2 + y^2} + Ax \right). \]

Now we can solve for $\sqrt{x^2 + y^2}$, square both sides of the equation, and
simplify. Assuming $A^2 \neq 1$, we obtain

\[ \frac{\alpha^2}{1 - A^2} = (1 - A^2) \left( x + \frac{A\alpha}{1 - A^2} \right)^2 + y^2. \tag{2.39} \]

This is the equation of an ellipse (if $A^2 < 1$) or a hyperbola (if $A^2 > 1$),
where the center of the ellipse or hyperbola is the point $(-A\alpha/(1 - A^2), 0)$.
In light of the formula for $A := |\mathbf{A}|$ in Proposition 2.35, we obtain an ellipse
if the energy of the particle is negative and a hyperbola if the energy is
positive.

In the case $A^2 < 1$, we may readily compute the half-lengths $a$ and $b$ of
the major and minor axes as

\[ a = \frac{\alpha}{1 - A^2}; \qquad b = \frac{\alpha}{\sqrt{1 - A^2}}. \]

From this, we readily calculate that the eccentricity is $A$. Now, the distance
between the foci of an ellipse is the length of the major axis times the
eccentricity, in our case, $2A\alpha/(1 - A^2)$. Since the center of the ellipse in
(2.39) is at the point $(-A\alpha/(1 - A^2), 0)$, the origin is one focus of the ellipse.
If $A^2 = 1$, then when we perform the same analysis, $x^2$ drops out of the
equation and we obtain

\[ x = \frac{1}{2A\alpha} \left( -y^2 + \alpha^2 \right), \]

which is the equation of a parabola opening along the $x$-axis. This case
corresponds to energy zero. ■
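
The link between the energy, the magnitude of $\mathbf{A}$, and the type of conic can be spot-checked numerically. The sketch below (an illustration, not part of the original text) computes $|\mathbf{A}|$ directly from Definition 2.33 and again from the energy formula in Proposition 2.35, and labels the orbit by the sign of $E$. The function name, the tolerance `tol` used to recognize zero energy in floating point, and the sample initial conditions are assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def orbit_summary(x, p, m=1.0, k=1.0, tol=1e-12):
    """Return (E, |A| from the definition, |A| from the energy formula, orbit type)."""
    J = cross(x, p)
    r = math.sqrt(dot(x, x))
    E = dot(p, p) / (2 * m) - k / r
    pxJ = cross(p, J)
    A_vec = tuple(pxJ[i] / (m * k) - x[i] / r for i in range(3))
    A_direct = math.sqrt(dot(A_vec, A_vec))
    # |A|^2 = 1 + 2 |J|^2 E / (m k^2); clamp at 0 to guard against rounding.
    A_energy = math.sqrt(max(0.0, 1 + 2 * dot(J, J) * E / (m * k * k)))
    if abs(E) <= tol:
        kind = "parabola"
    elif E < 0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return E, A_direct, A_energy, kind

# Same starting position, three different initial speeds.
for speed in (1.0, math.sqrt(2.0), 2.0):
    print(orbit_summary((1.0, 0.0, 0.0), (0.0, speed, 0.0)))
```

In each case the two values of $|\mathbf{A}|$ should agree up to rounding, and the eccentricity should be below, equal to, or above 1 according to the sign of the energy.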

Note  that  Proposition  2.37  does  not  tell  us  how  the  particle  moves  along 
the  ellipse,  hyperbola,  or  parabola  as  a  function  of  time.  We  can,  however, 
determine this, at least in principle, by making use of the angular momentum.
After all, applying (2.17) in the plane of motion gives

\[ \frac{d\theta}{dt} = \frac{J}{mr^2}, \tag{2.40} \]

where $\theta$ is the polar angle variable in the plane of motion. Since we have
computed $r$ as a function of $\theta$ in Corollary 2.36, (2.40) gives us a (first-order,
separable) differential equation, which we can attempt to solve
to obtain $\theta$, and thus also $r$, as a function of $t$.
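
As a sketch of what the separation of variables looks like (not part of the original text), substitute the orbit equation of Corollary 2.36 into (2.40). This assumes the orientation of the plane of motion is chosen so that the scalar angular momentum is positive, $J = |\mathbf{J}| > 0$, and writes $\theta_0$ for the (assumed) angle at time $t_0$.

```latex
% With r = |J|^2 / (m k (1 + A cos(theta))) and J = |J| > 0:
\[
\frac{d\theta}{dt} = \frac{J}{m r^2}
= \frac{m k^2}{|\mathbf{J}|^3} \left( 1 + A \cos\theta \right)^2 ,
\]
% so separating variables gives time as a function of angle:
\[
t - t_0 = \frac{|\mathbf{J}|^3}{m k^2}
\int_{\theta_0}^{\theta} \frac{d\theta'}{\left( 1 + A \cos\theta' \right)^2} .
\]
```

Inverting this relation to obtain $\theta$ as a function of $t$ is the step that can only be attempted, since the inverse is not an elementary function in general.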


2.6.4  Special  Properties  of  the  Kepler  Problem 

As we have said, the existence of another conserved vector-valued function
(in addition to the conserved energy and angular momentum) is special to
a potential of the form $-k/|\mathbf{x}|$. For a general radial potential, the energy
and the angular momentum will be the only conserved quantities. Assuming
$\mathbf{J} \neq 0$, the motion of a particle in any radial potential will always lie in the
plane perpendicular to $\mathbf{J}$. Taking this into account, we think of our particle
as moving in $\mathbb{R}^2$ rather than $\mathbb{R}^3$, and accordingly think of our phase space
as being four-dimensional rather than six-dimensional. From this point of
view, there are two remaining conserved quantities, the energy $E$ and the
scalar angular momentum $J$ in the plane, as given by Definition 2.17. Thus,
each trajectory will lie in a set of the form

\[ \{ (\mathbf{x}, \mathbf{p}) \in \mathbb{R}^2 \times \mathbb{R}^2 \mid E(\mathbf{x}, \mathbf{p}) = a,\ J(\mathbf{x}, \mathbf{p}) = b \}. \]

We refer to such a set as a joint level set of $E$ and $J$. These sets are
two-dimensional surfaces inside our four-dimensional phase space.

FIGURE 2.3. Trajectory in the plane of motion for a typical radial potential.

For  a  general  radial  potential,  a  trajectory  (x(t),p(t))  in  phase  space 
may  not  be  a  closed  curve,  but  may  fill  up  a  dense  subset  of  the  joint 
level  surface  on  which  it  lives.  In  particular,  the  trajectory  x(t)  in  position 
space  will  typically  not  be  a  closed  curve.  For  example,  x(t)  may  trace  out 
a  roughly  elliptical  region  in  the  plane,  but  where  the  axes  of  the  ellipse 
“precess,”  that  is,  vary  with  time.  Such  a  trajectory  is  shown  in  Fig.  2.3, 
which  should  be  contrasted  with  Fig.  2.2. 

In the Kepler problem, even after restricting attention to the plane of
motion, we still have one conserved quantity in addition to $E$ and $J$, namely
the direction of $\mathbf{A}$, which can be expressed in terms of the angle $\phi$ between
$\mathbf{A}$ and the $x_1$-axis in the plane of motion. (Note that both terms in the
definition of $\mathbf{A}$ lie in the plane of motion. Note also that the magnitude of $\mathbf{A}$
is, by Proposition 2.35, computable in terms of $E$ and $J$.) The trajectories
of the Kepler problem, then, lie in the joint level sets of $E$ and $J$ and $\phi$,
which are one-dimensional. When $E < 0$, the joint level sets of $E$ and $J$ are
compact, in which case the joint level sets of $E$ and $J$ and $\phi$ are compact
and one-dimensional, that is, simple closed curves.

Another  special  property  of  the  Kepler  problem  is  that  the  period  of  the 
closed  trajectories  (the  trajectories  with  negative  energy)  is  the  same  for  all 
trajectories  with  the  same  energy  (Exercise  21).  This  apparent  coincidence 
can  be  explained  by  showing  that  the  Hamiltonian  flows  (Definition  2.28) 
generated  by  J  and  A  act  transitively  on  the  energy  surfaces.  These  flows 
commute with the time evolution of the system, because $\mathbf{J}$ and $\mathbf{A}$ are
conserved quantities (Conclusion 2.31). Thus, any two points with the same
energy  are  “equivalent”  with  respect  to  time  evolution.  Although  we  will 
not  go  into  the  details  of  this  analysis,  we  will  gain  a  better  understanding 
of  the  flows  generated  by  the  components  of  A  in  Sect.  18.4. 


2.7  Exercises 

1. Consider a particle moving in the real line in the presence of a force
coming from a potential function $V$. Given some value $E_0$ for the
energy of the particle, suppose that $V(x) < E_0$ for all $x$ in some
closed interval $[x_0, x_1]$. Then a particle with initial position $x_0$ and
positive initial velocity will continue to move to the right until it
reaches $x_1$. Using (2.6), show that the time needed to travel from $x_0$
to $x_1$ is given by

\[ t = \int_{x_0}^{x_1} \sqrt{\frac{m}{2 (E_0 - V(y))}}\, dy. \]

Note: This shows that we can solve Newton’s equation in $\mathbb{R}^1$ more
or less explicitly for time as a function of position, which in principle
determines the position as a function of time.

2. In the notation of the previous problem, suppose now that $V(x) < E_0$
for $x_0 < x < x_1$, but that $V(x_1) = E_0$.

(a) Show that if $V'(x_1) \neq 0$, then the particle reaches $x_1$ in a finite
time.

(b) Show that if $V'(x_1) = 0$, then the time it takes the particle to
reach $x_1$ is infinite; that is, the particle approaches but never
actually reaches $x_1$.


Note: In Part (b), the point $x_1$ is an unstable equilibrium for the
system, that is, a critical point for $V$ that is not a local minimum.


3. Consider the equation of motion of a pendulum of length $L$,

\[ \frac{d^2\theta}{dt^2} + \frac{g}{L} \sin\theta = 0, \]

where $g$ is the acceleration of gravity. Here $\theta$ is the angle between the
pendulum and the negative $y$-axis in the plane. This system has a
stable equilibrium at $\theta = 0$ and an unstable equilibrium at $\theta = \pi$.

Consider initial conditions of the form $\theta(0) = \pi - \delta$, $\dot{\theta}(0) = 0$, for
$0 < \delta < \pi/4$. Fix some angle $\theta_0$ and let $T(\delta)$ denote the time it takes
for the pendulum with the given initial conditions to reach the angle
$\theta_0$. (Here $\theta_0$ represents an arbitrarily chosen cutoff point at which the
pendulum is no longer “close” to $\theta = \pi$.) Show that $T(\delta)$ grows only
logarithmically as $\delta$ tends to zero.

Note: Logarithmic growth of $T$ as a function of $\delta$ corresponds to
exponential decay of $\delta$ as a function of $T$. Thus, if we want $T$ to be
large, we must choose $\delta$ to be very small.


4. Consider a particle moving in the real line in the presence of a
“repelling potential,” such that there is an $A$ with $V'(x) < 0$ for
all $x > A$. Then a particle with initial position $x_0 > A$ and positive
initial velocity will have positive velocity for all positive times.
Suppose now that $V(x) = -x^a$ for all $x > 1$, for some positive constant
$a$. Suppose also that the particle is given initial position $x_0 > 1$ and
positive initial velocity. Show that for $a > 2$, the particle escapes to
infinity in finite time, but that for $a < 2$, the position of the particle
remains finite for all finite times.

Hint: Use Problem 1.


5. Consider the equation $m\ddot{x} + \gamma\dot{x} + kx = 0$, where $\gamma$ and $k$ are positive
constants (the damping constant and spring constant, respectively).
Find the critical value $\gamma_c$ of $\gamma$ (for a fixed $m$ and $k$) such that for
$\gamma < \gamma_c$, we get solutions that are sines and cosines times a decaying
exponential and for $\gamma > \gamma_c$, we get pure decaying exponentials.

6. Continue with the notation of Exercise 5. Given particular choices
for $m$, $\gamma$, and $k$, let $r$ be the rate of exponential decay of a “generic”
solution to the equation of motion. Here, if the solution is of the form
$a e^{-rt} \cos(\omega t) + b e^{-rt} \sin(\omega t)$, the rate of exponential decay is $r$. If the
solution is of the form $a e^{-r_1 t} + b e^{-r_2 t}$, then $r = \min(r_1, r_2)$, since
the slower-decaying term will dominate as long as $a$ and $b$ are both
nonzero.

For a fixed value of $m$ and $k$, show that the maximum value for $r$
is achieved by taking $\gamma = \gamma_c$. (This accounts for the terminology
“critical damping” for the case in which $\gamma = \gamma_c$.)

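7. Consider the $\mathbb{R}^2$-valued function $\mathbf{F}$ on $\mathbb{R}^2 \setminus \{0\}$ given by

\[ \mathbf{F}(x_1, x_2) = \left( -\frac{x_2}{x_1^2 + x_2^2},\ \frac{x_1}{x_1^2 + x_2^2} \right). \]

Show that $\partial F_1/\partial x_2 - \partial F_2/\partial x_1 = 0$ but that there does not exist any
smooth function $V$ on $\mathbb{R}^2 \setminus \{0\}$ with $\mathbf{F} = -\nabla V$.

Hint: If $\mathbf{F}$ were of the form $-\nabla V$, we would have

\[ V(\mathbf{x}(b)) - V(\mathbf{x}(a)) = -\int_a^b \mathbf{F}(\mathbf{x}(t)) \cdot \frac{d\mathbf{x}}{dt}\, dt \]

for every smooth path $\mathbf{x}(\cdot) : [a, b] \to \mathbb{R}^2 \setminus \{0\}$, by the fundamental
theorem of calculus and the chain rule.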


8. Consider a particle moving in $\mathbb{R}^n$ with a velocity-dependent force law
given by

\[ \mathbf{F}(\mathbf{x}, \mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x}, \mathbf{v}), \]

where the velocity-dependent term $\mathbf{F}_2$ acts perpendicularly to the
velocity of the particle. (That is, we assume that $\mathbf{v} \cdot \mathbf{F}_2(\mathbf{x}, \mathbf{v}) = 0$
for all $\mathbf{x}$ and $\mathbf{v}$.) Let $E$ denote the usual energy function
$E(\mathbf{x}, \mathbf{v}) = \frac{1}{2} m |\mathbf{v}|^2 + V(\mathbf{x})$, unmodified by the presence of the
velocity-dependent term in the force. Show that $E$ is conserved.

9. (a) If $r$ and $\theta$ are the usual polar coordinates on $\mathbb{R}^2$, compute $\partial\theta/\partial x_1$
and $\partial\theta/\partial x_2$.

(b) If $\mathbf{x}(\cdot)$ denotes the trajectory of a particle of mass $m$ moving in
$\mathbb{R}^2$, show that

\[ \frac{d}{dt} \theta(\mathbf{x}(t)) = \frac{1}{mr^2} J(\mathbf{x}(t), \mathbf{p}(t)). \]

10. Prove Theorem 2.21, by imitating the proof of Proposition 2.18. You
may assume that every rotation can be built up as a product of
repeated rotations in the various coordinate planes (i.e., rotations in
the $(x_j, x_k)$ plane, for various pairs $(j, k)$, where the same plane may
be used more than once).

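11. Consider Hamilton’s equations for $N$ particles moving in $\mathbb{R}^n$, as in
Proposition 2.32. Show that the total momentum $\mathbf{p} = \sum_{j=1}^{N} \mathbf{p}^j$ of the
system is a conserved quantity if and only if the quantity

\[ H(\mathbf{x}^1 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}, \mathbf{p}^1, \ldots, \mathbf{p}^N), \qquad \mathbf{a} \in \mathbb{R}^n, \]

is independent of $\mathbf{a}$ for all $\mathbf{x}^1, \ldots, \mathbf{x}^N$ and $\mathbf{p}^1, \ldots, \mathbf{p}^N$ in $\mathbb{R}^n$.

Hint: Use (the $N$-particle version of) Conclusion 2.31.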

12. Let $J$ denote the angular momentum of a particle moving in $\mathbb{R}^2$.
Let $R_\theta$ denote a counterclockwise rotation by angle $\theta$ in $\mathbb{R}^2$.

(a) If $f$ is any smooth function on $\mathbb{R}^4$, show that

\[ \{f, J\}(\mathbf{x}, \mathbf{p}) = \left. \frac{d}{d\theta} f(R_\theta \mathbf{x}, R_\theta \mathbf{p}) \right|_{\theta = 0}. \]

(b) Let $H$ be any smooth function on $\mathbb{R}^4$ and consider Hamilton’s
equations with this function playing the role of the Hamiltonian.
Show that $J$ is conserved (i.e., constant in time along any
solution of Hamilton’s equations) if and only if

\[ H(R_\theta \mathbf{x}, R_\theta \mathbf{p}) = H(\mathbf{x}, \mathbf{p}) \]

for all $\theta$ in $\mathbb{R}$ and all $\mathbf{x}$ and $\mathbf{p}$ in $\mathbb{R}^2$. (This argument is a more
explicit way to obtain Conclusion 2.31.)

13. Suppose that $f$ and $g$ are smooth functions on $\mathbb{R}^{2n}$ and that at least
one of the two functions has compact support. Show that

\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \{f, g\}(\mathbf{x}, \mathbf{p})\, d^n\mathbf{x}\, d^n\mathbf{p} = 0. \]

Hint: Use integration by parts or Liouville’s theorem.

14. Let $X$ and $Y$ be “vector fields” on $\mathbb{R}^n$, viewed as first-order differential
operators. This means that $X$ and $Y$ are of the form

\[ X = \sum_{j=1}^{n} a_j(\mathbf{x}) \frac{\partial}{\partial x_j}; \qquad Y = \sum_{j=1}^{n} b_j(\mathbf{x}) \frac{\partial}{\partial x_j}. \]

[If $\mathbf{X}(\mathbf{x}) = (a_1(\mathbf{x}), \ldots, a_n(\mathbf{x}))$, then the operator $X$ is the directional
derivative in the direction of $\mathbf{X}$. It is common to identify the
vector-valued function $\mathbf{X}$ with the associated first-order differential operator
$X$.]

Show that the commutator $[X, Y]$ of $X$ and $Y$, defined by

\[ [X, Y] = XY - YX, \]

is again a vector field (i.e., a first-order differential operator).

15. Given a smooth function $f$ on $\mathbb{R}^{2n}$, define an operator $X_f$, acting on
$C^\infty(\mathbb{R}^{2n})$, by the formula

\[ X_f(g) = \{f, g\}. \]

That is to say,

\[ X_f = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial x_j} \frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j} \frac{\partial}{\partial x_j} \right). \]

The operator $X_f$ is called the Hamiltonian vector field associated
with the function $f$. (Here, as in Exercise 14, we identify vector fields
with first-order differential operators.)

(a) Show that for all $f, g \in C^\infty(\mathbb{R}^{2n})$, we have

\[ X_{\{f, g\}} = [X_f, X_g], \]

where $[X_f, X_g] = X_f X_g - X_g X_f$.

Hint: By Exercise 14, all terms in the computation of $[X_f, X_g](h)$
involving second derivatives of $h$ can be neglected, since they will
always cancel out to zero.

(b) Use Part (a) to compute $\{\{f, g\}, h\} = X_{\{f, g\}}(h)$ and thereby
obtain another proof of the Jacobi identity for the Poisson bracket.


16. Recall the definition of a Hamiltonian vector field $X_f$ in Exercise 15.

(a) Consider a smooth vector field $X$ on $\mathbb{R}^2$ (viewed as a first-order
differential operator as in Exercise 14) of the form

\[ X = g_1(x, p) \frac{\partial}{\partial x} + g_2(x, p) \frac{\partial}{\partial p}. \]

Show that $X$ can be expressed as $X = X_f$, for some $f \in C^\infty(\mathbb{R}^2)$,
if and only if $X$ is divergence free, that is, if and only if

\[ \nabla \cdot X := \frac{\partial g_1}{\partial x} + \frac{\partial g_2}{\partial p} = 0. \]

Hint: As in Proposition 2.7, given a pair of functions $h_1$ and $h_2$
on $\mathbb{R}^2$, there exists a function $f$ with $\partial f/\partial x = h_1$ and $\partial f/\partial p = h_2$
if and only if we have $\partial h_1/\partial p = \partial h_2/\partial x$.

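(b) Show that there exists a smooth vector field $X$ on $\mathbb{R}^4$ of the form

\[ X = \sum_{j=1}^{2} \left( g_j(\mathbf{x}, \mathbf{p}) \frac{\partial}{\partial x_j} + g_{j+2}(\mathbf{x}, \mathbf{p}) \frac{\partial}{\partial p_j} \right) \]

such that

\[ \nabla \cdot X := \sum_{j=1}^{2} \left( \frac{\partial g_j}{\partial x_j} + \frac{\partial g_{j+2}}{\partial p_j} \right) = 0 \]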

but such that there does not exist $f \in C^\infty(\mathbb{R}^4)$ with $X = X_f$.
Hint: You should be able to find a counterexample in which the
coefficient functions $g_j$ are linear.

17. Show that the space of homogeneous polynomials of degree 2 on $\mathbb{R}^{2n}$
is closed under the Poisson bracket.

18. Determine the Hamiltonian flow on $\mathbb{R}^2$ generated by the function
$f(x, p) = xp$.

19. Let $\mathbf{J}$ denote the angular momentum vector for a particle moving in
$\mathbb{R}^3$, namely $\mathbf{J} = \mathbf{x} \times \mathbf{p}$. Show that the components $J_1$, $J_2$, and $J_3$ of
$\mathbf{J}$ satisfy the following Poisson bracket relations:

\[ \{J_1, J_2\} = J_3; \quad \{J_2, J_3\} = J_1; \quad \{J_3, J_1\} = J_2. \]


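20. In the Kepler problem, show that for each real number $E$ and positive
number $J$, there exists $\varepsilon > 0$ such that for all $(\mathbf{x}, \mathbf{p})$ with $H(\mathbf{x}, \mathbf{p}) = E$
and $|\mathbf{J}(\mathbf{x}, \mathbf{p})| = J$, we have $|\mathbf{x}| \geq \varepsilon$.

Hint: Suppose that $(\mathbf{x}_n, \mathbf{p}_n)$ is a sequence with $|\mathbf{J}(\mathbf{x}_n, \mathbf{p}_n)| = J$ and $|\mathbf{x}_n|$
tending to zero. Show that $H(\mathbf{x}_n, \mathbf{p}_n)$ tends to $+\infty$.
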

21. (a) Determine the area of the ellipse in the plane of motion in Proposition
2.37, in the case $A < 1$.

(b) Show that the time $T$ it takes the particle to travel once around
the ellipse is given by

\[ T = \frac{\pi}{\sqrt{2}} \, GM (-\tilde{E})^{-3/2}, \]

where $\tilde{E}$ is the “massless energy” of the particle, given by

\[ \tilde{E} = \frac{E}{m} = \frac{|\dot{\mathbf{x}}|^2}{2} - \frac{GM}{|\mathbf{x}|}. \]

Note that in the case where the trajectory in the plane of motion is
elliptical, the energy of the particle is negative.

Note: The result of Part (b) is closely related to Kepler’s third law.


3  A First Approach to Quantum Mechanics


In  this  chapter,  we  try  to  understand  the  main  ideas  of  quantum  mechanics. 
In  quantum  mechanics,  the  outcome  of  a  measurement  cannot — even  in 
principle — be  predicted  beforehand;  only  the  probabilities  for  the  outcome 
of  the  measurement  can  be  predicted.  These  probabilities  are  encoded  in  a 
wave function, which is a function of a position variable $x \in \mathbb{R}^n$. The square
of  the  absolute  value  of  the  wave  function  encodes  the  probabilities  for  the 
position  of  the  particle.  Meanwhile,  the  probabilities  for  the  momentum  of 
the  particle  are  encoded  in  the  frequency  of  oscillation  of  the  wave  function. 
The  probabilities  can  be  described  using  the  position  operator  and  the 
momentum  operator.  The  time-evolution  of  the  wave  function  is  described 
by the Hamiltonian operator, which is analogous to the Hamiltonian (or
energy)  function  in  Hamilton’s  equations. 


3.1  Waves,  Particles,  and  Probabilities 

There  are  two  key  ingredients  to  quantum  theory,  both  of  which  arose  from 
experiments.  The  first  ingredient  is  wave-particle  duality,  in  which  objects 
are  observed  to  have  both  wavelike  and  particlelike  behavior.  Light,  for 
example,  was  thought  to  be  a  wave  throughout  much  of  the  nineteenth 
century, but was observed in the early twentieth century to have particle
behavior as well. Electrons, meanwhile, were originally thought to be
particles, but were then observed to have wave behavior.

The second ingredient of quantum theory is its probabilistic behavior.
In  the  two-slit  experiment,  for  example,  electrons  that  are  “identically 
prepared”  do  not  all  hit  the  screen  at  the  same  point.  Quantum  theory 
postulates  that  this  randomness  is  fundamental  to  the  way  nature  behaves. 
According  to  quantum  mechanics,  it  is  impossible  (theoretically,  not  just 
in  practice)  to  predict  ahead  of  time  what  the  outcome  of  an  experiment 
will  be.  The  best  that  can  be  done  is  to  predict  the  probabilities  for  the 
outcome  of  an  experiment. 

These  two  aspects  of  quantum  theory  come  together  in  the  wave  function. 
The wave function is a function of a variable $x \in \mathbb{R}^n$, which we interpret as
describing the possible values of the position of a particle, and it evolves in
time according to a wavelike equation (the Schrödinger equation). The wave
function  and  its  time-evolution  account  for  the  wave  aspect  of  quantum 
theory.  The  particle  aspect  of  the  theory  comes  from  the  interpretation  of 
the  wave  function.  Although  it  is  tempting  to  interpret  the  wave  function 
as  a  sort  of  cloud,  where  we  have,  say,  a  little  bit  of  electron-cloud  over 
here, and a little bit of electron-cloud over there, this interpretation is not
consistent  with  experiment.  Whenever  we  attempt  to  measure  the  position 
of  a  single  electron,  we  always  find  the  electron  at  a  single  point.  A  single 
electron  in  the  two-slit  experiment  is  observed  at  a  single  point  on  the 
screen,  not  spread  out  over  the  screen  the  way  the  wave  function  is.  The 
wave  function  does  not  describe  something  that  is  directly  observable  for  a 
single  particle;  rather,  the  wave  function  determines  the  statistical  behavior 
of  a  whole  sequence  of  identically  prepared  particles.  See  Fig.  1.4  for  a 
dramatic  experimental  demonstration  of  this  effect. 

In  the  two-slit  experiment,  for  example,  it  is  possible  to  determine  how 
the wave function behaves as a function of time by solving the (deterministic)
Schrödinger equation. Knowledge of the wave function of an individual
electron, however, does not determine where that electron will hit the
screen.  The  wave  function  merely  tells  us  the  probability  distribution  for 
where  the  electron  might  hit  the  screen,  something  that  is  only  observable 
by  shooting  a  whole  sequence  of  electrons  at  the  screen. 
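
The statistical reading of the wave function can be imitated with a short simulation. The sketch below is only an illustration: the toy density (cosine-squared fringes under a Gaussian envelope), the interval [-4, 4], and the function names are assumptions, not taken from the text. Each simulated electron lands at a single point, and only the histogram of many hits shows the fringes.

```python
import math
import random

def density(x):
    """Unnormalized toy |psi(x)|^2 on the screen: fringes under a Gaussian envelope."""
    return math.cos(3.0 * x) ** 2 * math.exp(-x * x / 2.0)

def sample_hit(rng):
    """Draw one hit position by rejection sampling on [-4, 4]; density(x) <= 1 there."""
    while True:
        x = rng.uniform(-4.0, 4.0)
        if rng.random() < density(x):
            return x

def histogram(hits, lo=-4.0, hi=4.0, bins=48):
    """Count hits in equal-width bins covering [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in hits:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    return counts

rng = random.Random(0)
one_hit = sample_hit(rng)                        # a single electron: one point
hits = [sample_hit(rng) for _ in range(20000)]   # many identically prepared electrons
counts = histogram(hits)
```

Bin 24 sits on the central bright fringe and bin 27 on the first dark fringe of this toy density, so `counts[24]` should be far larger than `counts[27]`, even though no single hit carries that information.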

It  is  an  oversimplification,  but  a  useful  one,  to  describe  the  wave-particle 
aspect  of  quantum  theory  in  this  way:  a  single  electron  (or  photon,  or 
whatever)  acts  like  a  particle,  but  a  large  collection  of  electrons  behaves 
like  a  wave.  A  single  measurement  of  a  single  electron  always  gives  its 
position  as  a  point,  just  as  we  would  expect  for  a  particle.  This  point, 
however,  varies  from  one  electron  to  the  next,  even  if  we  shoot  each  electron 
toward  the  screen  in  precisely  the  same  way.  Repeated  measurements  of 
identically  prepared  electrons  give  a  distribution  that  can,  for  example, 
exhibit  interference  patterns,  just  as  we  would  expect  for  a  wave.  See,  again, 
Fig.  1.4,  which  should  be  compared  to  Figs.  1.1  and  1.2. 

It is interesting to note that at the macroscopic scale, where quantum effects
are not apparent, light appears to be a wave, whereas electrons appear
to be particles. This is the case even though both light and electrons are
really wave-particle hybrids, described in probabilistic terms by a wave
function. The difference between the two situations is that photons (the particles
of light) have mass zero, whereas electrons have positive mass. This
means  that  photons,  unlike  electrons,  can  easily  be  created  and  destroyed 
even  at  low  energies.  Thus,  the  discrete  aspect  of  light — namely,  that  the 
energy  in  light  comes  only  in  discrete  “quanta,”  namely  the  photons — is 
less  evident  than  the  corresponding  discrete  aspect  of  electrons. 


3.2  A Few Words About Operators and Their Adjoints

In  quantum  mechanics,  physical  quantities — such  as  position,  momentum, 
and  energy — are  represented  by  operators  on  a  certain  Hilbert  space  H. 
These operators are unbounded operators, reflecting that in classical mechanics,
these quantities are unbounded functions on the classical phase
space.  In  this  section,  we  look  briefly  at  some  technical  issues  related  to 
unbounded  operators  and  their  adjoints.  We  will  delay  a  full  discussion  of 
these  technicalities  (Chap.  9)  until  after  we  have  understood  the  basic  ideas 
of  quantum  mechanics. 

Here  and  throughout  the  book,  H  will  represent  a  Hilbert  space  over  C, 
always  assumed  to  be  separable.  We  follow  the  convention  in  the  physics 
literature  that  the  inner  product  be  linear  in  the  second  factor: 

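\[ \langle \phi, \lambda \psi \rangle = \lambda \langle \phi, \psi \rangle; \qquad \langle \lambda \phi, \psi \rangle = \bar{\lambda} \langle \phi, \psi \rangle \]

for all $\phi, \psi \in \mathbf{H}$ and all $\lambda \in \mathbb{C}$.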

Recall (Appendix A.3.4) that a linear operator $A : \mathbf{H} \to \mathbf{H}$ is bounded
if there is a constant $C$ such that $\|A\psi\| \leq C \|\psi\|$ for all $\psi \in \mathbf{H}$. For any
bounded operator $A$, there is a unique bounded operator $A^*$, called the
adjoint of $A$, such that

\[ \langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle \]

for all $\phi, \psi \in \mathbf{H}$. The existence of $A^*$ follows from the Riesz theorem (Appendix
A.4.3), by observing that for each fixed $\phi$, the map $\psi \mapsto \langle \phi, A\psi \rangle$
is a bounded linear functional on $\mathbf{H}$. A bounded operator is said to be
self-adjoint if $A^* = A$.
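
In finite dimensions every operator is bounded and the adjoint is the conjugate transpose. The following sketch checks the defining relation and the inner-product convention numerically on $\mathbb{C}^2$; the matrix, the vectors, and the helper names are arbitrary choices made for illustration.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(A[i][j] * psi[j] for j in range(len(psi))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of A."""
    n, m = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(n)] for j in range(m)]

A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
phi = [1 - 1j, 2 + 0.5j]
psi = [0.5j, -1 + 3j]
lam = 2 - 3j

lhs = inner(phi, apply(A, psi))            # <phi, A psi>
rhs = inner(apply(adjoint(A), phi), psi)   # <A* phi, psi>
second_factor = inner(phi, [lam * z for z in psi])   # should equal lam <phi, psi>
first_factor = inner([lam * z for z in phi], psi)    # should equal conj(lam) <phi, psi>
```

Up to rounding, `lhs` and `rhs` agree, and the last two quantities equal `lam * inner(phi, psi)` and `lam.conjugate() * inner(phi, psi)` respectively.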

For  various  reasons,  both  physical  and  mathematical,  we  want  the 
operators of quantum mechanics to be self-adjoint. Once one
sees  the  formulas  for  these  operators,  however,  one  is  confronted  with  a 
serious  technical  difficulty:  the  operators  are  not  bounded. 

If $A$ is a linear operator defined on all of $\mathbf{H}$ and having the property
that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all $\phi, \psi \in \mathbf{H}$, then $A$ is automatically bounded.
(See Corollary 9.9.) To put this fact the other way around, an unbounded
self-adjoint operator cannot be defined on the entire Hilbert space. Thus, to
deal with the unbounded operators of quantum mechanics, we must deal
with operators that are defined only on a subspace of the relevant Hilbert
space, called the domain of the operator.


Definition 3.1 An unbounded operator $A$ on $\mathbf{H}$ is a linear map from
a dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$ into $\mathbf{H}$.

More precisely, the operator $A$ is “not necessarily bounded,” since nothing
in the definition prevents us from having $\mathrm{Dom}(A) = \mathbf{H}$ and having $A$
be bounded.

In defining the adjoint of an unbounded operator, we immediately encounter
a difficulty: for a given $\phi \in \mathbf{H}$, the linear functional $\langle \phi, A\,\cdot \rangle$ may
not be bounded, in which case we cannot use the Riesz theorem to define
$A^*\phi$. What this means is that the adjoint of $A$, like $A$ itself, will be defined
not on all of $\mathbf{H}$ but only on some subspace thereof.


Definition 3.2 For an unbounded operator $A$ on $\mathbf{H}$, the adjoint $A^*$ of $A$
is defined as follows. A vector $\phi \in \mathbf{H}$ belongs to the domain $\mathrm{Dom}(A^*)$ of
$A^*$ if the linear functional

\[ \langle \phi, A\,\cdot \rangle, \]

defined on $\mathrm{Dom}(A)$, is bounded. For $\phi \in \mathrm{Dom}(A^*)$, let $A^*\phi$ be the unique
vector $\chi$ such that

\[ \langle \chi, \psi \rangle = \langle \phi, A\psi \rangle \]

for all $\psi \in \mathrm{Dom}(A)$.


Saying that the linear functional $\langle \phi, A\,\cdot \rangle$ is bounded means that there is
a constant $C$ such that $|\langle \phi, A\psi \rangle| \leq C \|\psi\|$ for all $\psi \in \mathrm{Dom}(A)$. If $\langle \phi, A\,\cdot \rangle$ is
bounded, then since $\mathrm{Dom}(A)$ is dense, the BLT theorem (Theorem A.36)
tells us that $\langle \phi, A\,\cdot \rangle$ has a unique bounded extension to all of $\mathbf{H}$. The Riesz
theorem then guarantees the existence and uniqueness of $\chi$. The adjoint of
an unbounded linear operator is a linear operator on its domain.

We  are  now  ready  to  define  self-adjointness  (and  some  related  notions) 
for  unbounded  operators. 


Definition 3.3 An unbounded operator $A$ on $\mathbf{H}$ is symmetric if

\[ \langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle \]

for all $\phi, \psi \in \mathrm{Dom}(A)$. The operator $A$ is self-adjoint if
$\mathrm{Dom}(A^*) = \mathrm{Dom}(A)$ and $A^*\phi = A\phi$ for all $\phi \in \mathrm{Dom}(A)$. Finally, $A$ is essentially
self-adjoint if the closure in $\mathbf{H} \times \mathbf{H}$ of the graph of $A$ is the graph of a
self-adjoint operator.

That is to say, $A$ is self-adjoint if $A^*$ and $A$ are the same operator with
the same domain. Every self-adjoint or essentially self-adjoint operator is
symmetric, but not every symmetric operator is essentially self-adjoint.
For any symmetric operator, $\mathrm{Dom}(A^*) \supset \mathrm{Dom}(A)$ and $A^*$ agrees with $A$
on $\mathrm{Dom}(A)$. The reason a symmetric operator may fail to be self-adjoint is
that $\mathrm{Dom}(A^*)$ may be strictly larger than $\mathrm{Dom}(A)$.
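
A standard illustration of a symmetric operator whose adjoint has a strictly larger domain is sketched below. It is not taken from the text above: the interval $[0,1]$, the particular domain, and the choice of units with $\hbar = 1$ are all assumptions made for the example.

```latex
% Assumed example: H = L^2([0,1]), A = -i d/dx, hbar = 1.
\[
  \mathrm{Dom}(A) = \{ \psi \in C^1([0,1]) : \psi(0) = \psi(1) = 0 \},
  \qquad A\psi = -i \frac{d\psi}{dx}.
\]
% Integration by parts; the boundary term vanishes because psi(0) = psi(1) = 0.
\[
  \langle \phi, A\psi \rangle
  = -i \, \overline{\phi(x)} \psi(x) \Big|_0^1
    + \int_0^1 \overline{\left( -i \frac{d\phi}{dx} \right)} \psi(x) \, dx
  = \langle -i \phi', \psi \rangle
  \quad \text{for all } \phi \in C^1([0,1]),\ \psi \in \mathrm{Dom}(A).
\]
% Hence every C^1 function phi lies in Dom(A^*), with A^* phi = -i phi'.
\[
  \phi(x) \equiv 1 \in \mathrm{Dom}(A^*) \setminus \mathrm{Dom}(A),
  \qquad\text{so}\qquad \mathrm{Dom}(A^*) \supsetneq \mathrm{Dom}(A).
\]
```

Taking $\phi \in \mathrm{Dom}(A)$ in the middle line shows that this $A$ is symmetric, while the last line shows that it is not self-adjoint.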

Although  the  condition  of  being  symmetric  is  certainly  easier  to 
understand (and to verify) than the condition of being self-adjoint, self-adjointness
is the “right” condition. In particular, the spectral theorem,
which  is  essential  to  much  of  quantum  mechanics,  applies  only  to  operators 
that  are  self-adjoint  and  not  to  operators  that  are  merely  symmetric.  If  A 
is  essentially  self-adjoint,  then  we  can  obtain  a  self-adjoint  operator  from 
A  simply  by  taking  the  closure  of  the  graph  of  A,  and  we  can  then  apply 
the spectral theorem to this self-adjoint operator. Thus, for many purposes,
it  is  enough  to  have  our  operators  be  essentially  self-adjoint  rather  than 
self-adjoint. 

It  is  generally  easy  to  verify  that  the  operators  of  quantum  mechanics 
(those  representing  position,  momentum,  and  so  forth)  are  symmetric  on 
some  suitably  chosen  domain.  Proving  that  these  operators  are  essentially 
self-adjoint,  however,  is  substantially  more  difficult.  Although  establishing 
essential  self-adjointness  is  a  crucial  technical  issue,  it  is  best  not  to  worry 
too  much  about  it  on  a  first  encounter  with  quantum  mechanics.  In  this 
chapter, we will not concern ourselves overly with technical details concerning
essential self-adjointness and the precise choice of domain for our
operators, depending on Chap. 9 to take care of such matters. For now, we
content ourselves with deriving some very elementary properties of symmetric
(and thus also self-adjoint) operators.

Proposition 3.4 Suppose $A$ is a symmetric operator on $\mathbf{H}$.

1. For all $\psi \in \mathrm{Dom}(A)$, the quantity $\langle \psi, A\psi \rangle$ is real. More generally, if
$\psi, A\psi, \ldots, A^{m-1}\psi$ all belong to $\mathrm{Dom}(A)$, then $\langle \psi, A^m\psi \rangle$ is real.

2. Suppose $\lambda$ is an eigenvalue for $A$, meaning that $A\psi = \lambda\psi$ for some
nonzero $\psi \in \mathrm{Dom}(A)$. Then $\lambda \in \mathbb{R}$.

Proof. Since $A$ is symmetric, we have

\[ \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \overline{\langle \psi, A\psi \rangle} \]

for all $\psi \in \mathrm{Dom}(A)$. If $\psi, A\psi, \ldots, A^{m-1}\psi$ all belong to the domain of $A$,
we can use the symmetry of $A$ repeatedly to show that

\[ \langle \psi, A^m\psi \rangle = \langle A^m\psi, \psi \rangle = \overline{\langle \psi, A^m\psi \rangle}. \]

Meanwhile, if $\psi$ is an eigenvector for $A$ with eigenvalue $\lambda$, then

\[ \lambda \langle \psi, \psi \rangle = \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \bar{\lambda} \langle \psi, \psi \rangle. \]

Since $\psi$ is assumed to be nonzero, this implies that $\lambda = \bar{\lambda}$. ■

Physically, $\langle \psi, A\psi \rangle$ represents (as we will see later in this chapter)
the expectation value for measurements of $A$ in the state $\psi$, whereas the
eigenvalue $\lambda$ represents one of the possible values for this measurement.
On physical grounds, we want both of these numbers to be real. If $A$ is
self-adjoint, and not just symmetric, then the spectral theorem will give
a canonical way of associating to each $\psi \in \mathbf{H}$ a probability measure on
the real line that encodes the probabilities for measurements of $A$ in the
state $\psi$.
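
A finite-dimensional sanity check of Proposition 3.4 is sketched below, where "symmetric" simply means Hermitian. The matrices `A` and `B`, the unit vector `psi`, and the helper names are arbitrary choices made for this illustration.

```python
import cmath

def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(A[i][j] * psi[j] for j in range(len(psi))) for i in range(len(A))]

def eigenvalues_2x2(M):
    """Roots of the characteristic polynomial t^2 - tr(M) t + det(M)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# A is Hermitian (symmetric in the sense of Definition 3.3); B is not.
A = [[2 + 0j, 1 - 3j], [1 + 3j, -1 + 0j]]
B = [[2 + 0j, 1 - 3j], [0j, -1 + 0j]]
psi = [0.6 + 0j, 0.8j]   # a unit vector in C^2

exp_A = inner(psi, apply(A, psi))   # real, as Point 1 predicts
exp_B = inner(psi, apply(B, psi))   # has a nonzero imaginary part
eigs_A = eigenvalues_2x2(A)         # real, as Point 2 predicts
```

For the symmetric matrix the expectation value and both eigenvalues come out real, while the non-symmetric matrix `B` gives a genuinely complex value of $\langle \psi, B\psi \rangle$, which could not be interpreted as the average of real measurement outcomes.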


3.3  Position  and  the  Position  Operator 


Let us consider at first a single particle moving on the real line. The wave
function for such a particle is a map $\psi : \mathbb{R}^1 \to \mathbb{C}$. Although this map will
evolve in time, let us think for now that the time is fixed. The function
$|\psi(x)|^2$ is supposed to be the probability density for the position of the
particle. This means that the probability that the position of the particle
belongs to some set $E \subset \mathbb{R}^1$ is

\[ \int_E |\psi(x)|^2 \, dx. \]

For this prescription to make sense, $\psi$ should be normalized so that

\[ \int_{\mathbb{R}} |\psi(x)|^2 \, dx = 1. \tag{3.1} \]

That is, $\psi$ should be a unit vector in the Hilbert space $L^2(\mathbb{R})$.

Now, if the function $|\psi(x)|^2$ is the probability density for the position of
a particle, then according to the standard definitions of probability theory,
the expectation value of the position will be

\[ E(x) = \int_{\mathbb{R}} x |\psi(x)|^2 \, dx, \tag{3.2} \]

provided that the integral is absolutely convergent. More generally, we can
compute any moment of the position (i.e., the expectation value of some
power of the position) as

\[ E(x^m) = \int_{\mathbb{R}} x^m |\psi(x)|^2 \, dx, \tag{3.3} \]

assuming, again, the convergence of the integral.

A key idea in quantum theory is to express expectation values of various
quantities (position, momentum, energy, etc.) in terms of operators and
the inner product on the relevant Hilbert space, in this case, $L^2(\mathbb{R})$. In the
case of position, we may introduce the position operator $X$ defined by

\[ (X\psi)(x) = x\psi(x). \]

That is, $X$ is the “multiplication by $x$” operator. The point of introducing
this operator is that the expectation value of the position [defined in (3.2)]
may now be expressed as

\[ E(x) = \langle \psi, X\psi \rangle, \]

where the inner product is the usual one on $L^2(\mathbb{R})$:

\[ \langle \phi, \psi \rangle = \int_{\mathbb{R}} \overline{\phi(x)} \psi(x) \, dx. \]

(Recall that we are following the physics convention of putting the conjugate
on the first factor in the inner product.)

We use the following notation for the expectation value of the operator
$X$ in the state $\psi$:

\[ \langle X \rangle_\psi := \langle \psi, X\psi \rangle. \]

The higher moments of the position, as defined in (3.3), are also computable
in terms of the position operator:

\[ E(x^m) = \langle \psi, X^m\psi \rangle. \]

At this point, it is not clear that we have gained anything by writing
our moments in terms of an operator and the inner product instead of in
terms of the integral (3.3). The operator description will, however, motivate
a parallel description of moments for the momentum, energy, or angular
momentum of a particle in terms of corresponding operators.

It should be noted that, for a given $\psi \in L^2(\mathbb{R})$, $X\psi$ might fail to be in
$L^2(\mathbb{R})$. This failure of $X$ to be defined on all of our Hilbert space reflects
that $X$ is an unbounded operator, something that we discussed briefly in
Sect. 3.2. Even if $X\psi$ is in $L^2(\mathbb{R})$, $X^m\psi$ might fail to be in $L^2(\mathbb{R})$ for some
$m$. Nevertheless, for any unit vector $\psi$ in $L^2(\mathbb{R})$, we have a well-defined
probability density on $\mathbb{R}$, given by $|\psi(x)|^2$.
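
The moments of the position can be approximated numerically for a concrete wave function. The sketch below assumes a Gaussian $\psi(x) = \pi^{-1/4} e^{-(x - x_0)^2/2}$ with an arbitrarily chosen center $x_0 = 1.5$; for this $\psi$ the density $|\psi(x)|^2$ is a normal density with mean $x_0$ and variance $1/2$, so the exact values are known. The integration window and step count are also assumptions of the sketch.

```python
import math

X0 = 1.5  # assumed center of the Gaussian packet

def psi(x):
    """Gaussian wave function, a unit vector in L^2(R)."""
    return math.pi ** -0.25 * math.exp(-((x - X0) ** 2) / 2.0)

def moment(m, lo=-20.0, hi=20.0, n=40000):
    """Approximate <psi, X^m psi>, the integral of x^m |psi(x)|^2, by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += x ** m * abs(psi(x)) ** 2
    return total * h

norm = moment(0)     # close to 1, as required by the normalization condition
mean = moment(1)     # close to X0
second = moment(2)   # close to X0**2 + 1/2
```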


3.4  Momentum  and  the  Momentum  Operator 

At any fixed time, the wave function $\psi(x)$ of a particle (according to the
wave theory postulated by Schrödinger) is a function of a “position” variable
$x$ only. Although the wave function $\psi$ directly encodes the probabilities
for the position of the particle, through $|\psi(x)|^2$, it is not as clear how information
about the particle’s momentum is encoded. As it turns out, the
momentum is encoded in the oscillations of the wave function. A crucial
idea in quantum mechanics is the de Broglie hypothesis, which we introduced
in Sect. 1.2.2 as a way of understanding the allowed energies in the
Bohr model of the hydrogen atom. The de Broglie hypothesis proposes
a particular relationship between the frequency of oscillation of the wave
function (as a function of position at a fixed time) and its momentum.

Proposition 3.5 (de Broglie hypothesis) If the wave function of a
particle has spatial frequency $k$, then the momentum $p$ of the particle is

\[ p = \hbar k, \tag{3.4} \]

where $\hbar$ is Planck’s constant.

The Davisson-Germer electron-diffraction experiments, described in Sect.
1.2.3, strongly support not only the idea that electrons have wavelike
behavior, but also the specific relationship (3.4) between the momentum
of an electron and the spatial frequency of the associated wave. Of course,
Proposition 3.5 is rather vague. To be a bit more precise, Proposition 3.5 is
supposed to mean that a wave function of the form $\psi(x) = e^{ikx}$ represents
a particle with momentum $p = \hbar k$. [Here, as in Chap. 2, “frequency” is in
the angular sense. The cycles-per-unit-distance frequency is $\nu = k/(2\pi)$.]

Now, the function $e^{ikx}$ is obviously not square integrable, so it is not
strictly possible for the wave function [which is supposed to satisfy (3.1)]
to be $e^{ikx}$. Let us therefore briefly switch to thinking of a particle on a circle,
so that we can avoid certain technicalities. We think of the wave function
for a particle on a circle as a $2\pi$-periodic function on $\mathbb{R}$, satisfying the
normalization condition

\[ \int_0^{2\pi} |\psi(x)|^2 \, dx = 1. \]

For any integer $k$, it makes sense to say that the normalized wave function
$\psi(x) = e^{ikx}/\sqrt{2\pi}$ represents a particle with momentum $p = \hbar k$. In this case,
we are supposed to think that the momentum of the particle is definite,
that is, nonrandom. If the particle’s wave function is $e^{ikx}/\sqrt{2\pi}$, then a
measurement of the particle’s momentum should (with probability 1) give
the value $\hbar k$.

Now, the functions $e^{ikx}/\sqrt{2\pi}$, $k \in \mathbb{Z}$, form an orthonormal basis for the
Hilbert space of $2\pi$-periodic, square-integrable functions, which may be
identified with $L^2([0, 2\pi])$. Thus, the typical wave function for a particle on
a circle is

\[ \psi(x) = \sum_{k=-\infty}^{\infty} a_k \frac{e^{ikx}}{\sqrt{2\pi}}, \tag{3.5} \]

where the sum is convergent in $L^2([0, 2\pi])$. If $\psi$ is normalized to be a unit
vector, then we have

\[ \|\psi\|^2_{L^2([0, 2\pi])} = \sum_{k=-\infty}^{\infty} |a_k|^2 = 1. \tag{3.6} \]


For a particle with wave function given by (3.5), the momentum of the
particle is no longer definite. Rather, we are supposed to think that a
measurement of the particle’s momentum will yield one of the values $\hbar k$,
$k \in \mathbb{Z}$, with the probability of getting a particular value $\hbar k$ being $|a_k|^2$.
Following elementary probability theory, then, the expectation value for
the momentum should be

\[ E(p) = \sum_{k=-\infty}^{\infty} \hbar k \, |a_k|^2 \tag{3.7} \]

and higher moments for the momentum should be

\[ E(p^m) = \sum_{k=-\infty}^{\infty} (\hbar k)^m |a_k|^2, \tag{3.8} \]

assuming absolute convergence of the sum.
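
For a wave function on the circle with only finitely many nonzero coefficients, the momentum probabilities and moments reduce to finite sums. The sketch below uses hypothetical coefficients $a_k$ and units with $\hbar = 1$, both of which are assumptions made for illustration.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

# Hypothetical coefficients a_k in the Fourier expansion; all other a_k are zero.
coeffs = {-1: 0.5j, 0: 0.5 + 0.5j, 2: -0.5 + 0j}

def probabilities(a):
    """Probability |a_k|^2 of measuring the momentum value hbar * k."""
    return {k: abs(ak) ** 2 for k, ak in a.items()}

def momentum_moment(a, m):
    """E(p^m) = sum over k of (hbar k)^m |a_k|^2."""
    return sum((HBAR * k) ** m * abs(ak) ** 2 for k, ak in a.items())

probs = probabilities(coeffs)
total = sum(probs.values())             # 1 for a unit vector
mean_p = momentum_moment(coeffs, 1)     # -1(0.25) + 0(0.5) + 2(0.25) = 0.25
mean_p2 = momentum_moment(coeffs, 2)    # 1(0.25) + 0(0.5) + 4(0.25) = 1.25
```

With these coefficients, a momentum measurement would return $-\hbar$, $0$, or $2\hbar$ with probabilities $1/4$, $1/2$, and $1/4$.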

We would like to encode the moment conditions (3.7) and (3.8) in a
momentum operator $P$, which should be defined in such a way that if the
particle’s wave function $\psi$ is given by (3.5), then $E(p^m) = \langle \psi, P^m\psi \rangle$.
We can achieve this relation if $P$ satisfies

\[ P e^{ikx} = \hbar k \, e^{ikx}, \tag{3.9} \]

since then,

\[ \langle \psi, P^m\psi \rangle = \sum_{k=-\infty}^{\infty} (\hbar k)^m |a_k|^2 = E(p^m). \tag{3.10} \]

The (presumably unique) choice for $P$ satisfying (3.9) is

\[ P = -i\hbar \frac{d}{dx}. \]

Returning now to the setting of the real line, it is natural to postulate
that the momentum operator $P$ on the line should also be given by
$P = -i\hbar \, d/dx$. This operator satisfies the relation

\[ P e^{ikx} = (\hbar k) e^{ikx}, \]

which is supposed to capture the idea that the wave function $e^{ikx}$ has
momentum $\hbar k$. Although the function $e^{ikx}$ is not square-integrable with respect
to $x$, the Fourier transform allows us to build up any square-integrable
function as a “superposition” of functions of the form $e^{ikx}$. (Superposition
is the term physicists use for a linear combination or the continuous analog
thereof, namely an integral.) This means that [by analogy to (3.5)] we have

\[ \psi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx} \hat{\psi}(k) \, dk, \tag{3.11} \]

where $\hat{\psi}(k)$ is the Fourier transform of $\psi$, defined by

\[ \hat{\psi}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx} \psi(x) \, dx. \tag{3.12} \]

(See Appendix A.3.2 for information about the Fourier transform.)

The Plancherel theorem (Theorem A.19) then tells us that the Fourier
transform is a unitary map of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. Thus, for any unit vector
$\psi \in L^2(\mathbb{R})$,

\[ \int_{-\infty}^{\infty} |\hat{\psi}(k)|^2 \, dk = \int_{-\infty}^{\infty} |\psi(x)|^2 \, dx = 1. \]

In light of what we have in the circle case, it is natural to think that $|\hat{\psi}(k)|^2$
is essentially the probability density for the momentum of the particle.
(To be precise, $|\hat{\psi}(k)|^2$ is the probability density for $p/\hbar$.)

We can now express the properties of the momentum operator entirely
within the Hilbert space $L^2(\mathbb{R})$, without making explicit mention of the
non-square-integrable functions $e^{ikx}$.


Proposition 3.6 Define the momentum operator $P$ by

\[ P = -i\hbar \frac{d}{dx}. \]

Then for all sufficiently nice unit vectors $\psi$ in $L^2(\mathbb{R})$, we have

\[ \langle \psi, P^m\psi \rangle = \int_{-\infty}^{\infty} (\hbar k)^m |\hat{\psi}(k)|^2 \, dk \tag{3.13} \]

for all positive integers $m$. The quantity in (3.13) is interpreted as the
expectation value of the $m$th power of the momentum, $E(p^m)$.

Equation (3.13) should be compared to (3.10) in the case of the circle.

Proof. If $\psi$ is in, say, the Schwartz space (Definition A.15), then, by applying
Proposition A.17 $m$ times, we see that the Fourier transform of the
$m$th derivative of $\psi$ is $(ik)^m \hat{\psi}(k)$, and so the Fourier transform of $P^m\psi$ is
$(\hbar k)^m \hat{\psi}(k)$. Meanwhile, since the Fourier transform is unitary, we have

\[ \langle \psi, P^m\psi \rangle = \langle \hat{\psi}, \widehat{P^m\psi} \rangle = \int_{-\infty}^{\infty} \overline{\hat{\psi}(k)} \, (\hbar k)^m \hat{\psi}(k) \, dk, \]

which gives (3.13). (The assumption that $\psi$ be in the Schwartz space is
stronger than necessary. The reader is invited to use integration by parts
and the definition of the Fourier transform to find weaker assumptions that
allow the same conclusion.) ■
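
The momentum moments can be checked numerically on a concrete example. The sketch below assumes the Gaussian packet $\psi(x) = \pi^{-1/4} e^{ik_0 x} e^{-x^2/2}$ with $k_0 = 2$ and $\hbar = 1$; its Fourier transform is a Gaussian centered at $k_0$, so the momentum-side integral equals $\hbar k_0$ for $m = 1$ and $\hbar^2 (k_0^2 + 1/2)$ for $m = 2$. The derivative is approximated by a central difference, and the step sizes and integration window are arbitrary choices.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumed)
K0 = 2.0    # assumed spatial frequency of the packet

def psi(x):
    """Gaussian packet with carrier frequency K0, a unit vector in L^2(R)."""
    return math.pi ** -0.25 * cmath.exp(1j * K0 * x) * math.exp(-x * x / 2.0)

def P(f, dx=1e-4):
    """Return the function P f = -i hbar f', with f' from a central difference."""
    return lambda x: -1j * HBAR * (f(x + dx) - f(x - dx)) / (2.0 * dx)

def inner(f, g, lo=-12.0, hi=12.0, n=4000):
    """Midpoint-rule approximation of the L^2(R) inner product, conjugate on f."""
    h = (hi - lo) / n
    xs = (lo + (i + 0.5) * h for i in range(n))
    return sum(f(x).conjugate() * g(x) for x in xs) * h

mean_p = inner(psi, P(psi))        # approximates <psi, P psi> = hbar * K0
mean_p2 = inner(P(psi), P(psi))    # equals <psi, P^2 psi> when P is symmetric
```

The second moment is computed as $\langle P\psi, P\psi \rangle$ to avoid a numerical second derivative; this relies on the symmetry of $P$ on nice functions.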


3.5  The  Position  and  Momentum  Operators 

In the following definition, we summarize what we have learned, in the two
previous sections, about the position and momentum operators.

Definition 3.7 For a particle moving in $\mathbb{R}^1$, let the quantum Hilbert space
be $L^2(\mathbb{R})$ and define the position and momentum operators $X$ and $P$
by

\[ X\psi(x) = x\psi(x), \]
\[ P\psi(x) = -i\hbar \frac{d\psi}{dx}. \]

Neither the position nor the momentum operator is defined as mapping
the entire Hilbert space $L^2(\mathbb{R})$ into itself. After all, for $\psi \in L^2(\mathbb{R})$, the
function $x\psi(x)$ may fail to be in $L^2(\mathbb{R})$. Similarly, a function in $L^2(\mathbb{R})$ may
fail to be differentiable, and even if it is differentiable, the derivative may fail
to be in $L^2(\mathbb{R})$. What this means is that $X$ and $P$ are unbounded operators,
of the sort discussed briefly in Sect. 3.2. They are defined on suitable dense
subspaces $\mathrm{Dom}(X)$ and $\mathrm{Dom}(P)$ of $L^2(\mathbb{R})$. We defer a detailed examination
of the domains of these operators until Chap. 9.
A  vitally  important  property  of  this  pair  of  operators  is  that  they  do  not 
commute. 


Proposition 3.8 The position and momentum operators $X$ and $P$ do not
commute, but satisfy the relation

\[ XP - PX = i\hbar I. \tag{3.14} \]

This relation is known as the canonical commutation relation.

Proof. Using the product rule, we calculate that

\[ PX\psi = -i\hbar \frac{d}{dx} (x\psi(x)) = -i\hbar \psi(x) - i\hbar x \frac{d\psi}{dx} = -i\hbar \psi(x) + XP\psi, \]

from which (3.14) follows. ■
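
The commutation relation can also be checked pointwise on a test function. The sketch below is a numerical illustration under assumptions: the test function, the sample points, the units with $\hbar = 1$, and the central-difference approximation of the derivative are all choices made here, so the identity holds only up to discretization error.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumed)

def psi(x):
    """An arbitrary smooth test function (assumed for illustration)."""
    return cmath.exp(1j * x) * math.exp(-x * x / 2.0)

def X(f):
    """Position operator: multiplication by x."""
    return lambda x: x * f(x)

def P(f, dx=1e-4):
    """Momentum operator -i hbar d/dx, with the derivative from a central difference."""
    return lambda x: -1j * HBAR * (f(x + dx) - f(x - dx)) / (2.0 * dx)

def commutator_XP(f):
    """(XP - PX) f, evaluated pointwise."""
    xp, px = X(P(f)), P(X(f))
    return lambda x: xp(x) - px(x)

points = [-1.0, 0.0, 0.7, 2.0]
residuals = [abs(commutator_XP(psi)(x) - 1j * HBAR * psi(x)) for x in points]
```

Each entry of `residuals` measures how far $(XP - PX)\psi$ is from $i\hbar\psi$ at one point; all of them are tiny compared with $|\psi|$.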

There are many important consequences of the relation (3.14), which we
will examine at length in Chaps. 11-14 of the book. For now, we simply note
a parallel between (3.14) and the Poisson bracket relationship in classical
mechanics: $\{x, p\} = 1$, as follows directly from the definition of the Poisson
bracket. This hints at an analogy, which we will explore further in Sect. 3.7,
between the commutator of two operators $A$ and $B$ on the quantum side
(namely, the operator $AB - BA$) and the Poisson bracket of two functions
$f$ and $g$ on the classical side.

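Proposition 3.9 For all sufficiently nice functions $\phi$ and $\psi$ in $L^2(\mathbb{R})$,
we have

\[ \langle \phi, X\psi \rangle = \langle X\phi, \psi \rangle \]

and

\[ \langle \phi, P\psi \rangle = \langle P\phi, \psi \rangle. \]

Proof. Suppose that $\phi$ and $\psi$ belong to $L^2(\mathbb{R})$ and that the functions $x\phi(x)$
and $x\psi(x)$ also belong to $L^2(\mathbb{R})$. Then since $x$ is real, we have

\[ \int_{-\infty}^{\infty} \overline{\phi(x)} \, x\psi(x) \, dx = \int_{-\infty}^{\infty} \overline{x\phi(x)} \, \psi(x) \, dx, \]

where both integrals are convergent because they are both integrals of the
product of two $L^2$ functions.

Meanwhile, for the second claim, let us assume that $\phi$ and $\psi$ are continuously
differentiable and that $\phi(x)$ and $\psi(x)$ tend to zero as $x$ tends to
$\pm\infty$. Let us also assume that $\phi$, $\psi$, $d\phi/dx$, and $d\psi/dx$ belong to $L^2(\mathbb{R})$. We
note that $d\bar{\phi}/dx$ is the same as $\overline{d\phi/dx}$. Thus, using integration by parts,
we obtain

\[ -i\hbar \int_{-A}^{A} \overline{\phi(x)} \frac{d\psi}{dx} \, dx = -i\hbar \, \overline{\phi(x)} \psi(x) \Big|_{-A}^{A} + i\hbar \int_{-A}^{A} \overline{\frac{d\phi}{dx}} \, \psi(x) \, dx. \]

Under our assumptions on $\phi$ and $\psi$, as $A$ tends to infinity, the boundary
terms will vanish and the remaining integrals will tend (by dominated
convergence) to integrals over the whole real line. Thus,

\[ -i\hbar \int_{-\infty}^{\infty} \overline{\phi(x)} \frac{d\psi}{dx} \, dx = \int_{-\infty}^{\infty} \overline{\left( -i\hbar \frac{d\phi}{dx} \right)} \psi(x) \, dx, \]

which is the second claim in the proposition. ■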
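
Both identities in Proposition 3.9 can be tested numerically on a pair of rapidly decaying functions. In the sketch below, the two test functions, the units with $\hbar = 1$, the integration window, and the finite-difference derivative are assumptions made for illustration; agreement is therefore only up to discretization error.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumed)

def phi(x):
    """First test function: a shifted Gaussian with a phase (assumed)."""
    return math.exp(-((x - 1.0) ** 2)) * cmath.exp(0.5j * x)

def psi(x):
    """Second test function: a polynomial times a Gaussian with a phase (assumed)."""
    return (1.0 + x) * math.exp(-x * x / 2.0) * cmath.exp(-2j * x)

def X(f):
    """Position operator: multiplication by x."""
    return lambda x: x * f(x)

def P(f, dx=1e-4):
    """Momentum operator -i hbar d/dx, with the derivative from a central difference."""
    return lambda x: -1j * HBAR * (f(x + dx) - f(x - dx)) / (2.0 * dx)

def inner(f, g, lo=-15.0, hi=15.0, n=6000):
    """Midpoint-rule approximation of the L^2(R) inner product, conjugate on f."""
    h = (hi - lo) / n
    xs = (lo + (i + 0.5) * h for i in range(n))
    return sum(f(x).conjugate() * g(x) for x in xs) * h

lhs_X, rhs_X = inner(phi, X(psi)), inner(X(phi), psi)
lhs_P, rhs_P = inner(phi, P(psi)), inner(P(phi), psi)
```

The pairs `lhs_X`, `rhs_X` and `lhs_P`, `rhs_P` agree to within the numerical error, as the symmetry of $X$ and $P$ on such functions predicts.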

In the language of Definition 3.3, Proposition 3.9 means that $X$ and $P$
are symmetric operators on certain dense subspaces of $L^2(\mathbb{R})$ (the space of
functions for which the proposition is proved). It is actually true that $X$
and $P$ are essentially self-adjoint on these domains. The proof of essential
self-adjointness, however, will have to wait until Chap. 9.


3.6  Axioms  of  Quantum  Mechanics:  Operators 
and  Measurements 

In  this  section  we  consider  the  general  “axioms”  of  quantum  mechanics. 
These  axioms  are  not  to  be  understood  in  the  mathematical  sense  as  rules 
from  which  all  other  results  are  derived  in  a  strictly  deductive  fashion. 
Rather,  the  axioms  are  the  main  principles  of  how  quantum  mechanics 
works.  Here  we  look  at  the  “kinematic”  axioms,  those  that  apply  at  one 
fixed  time.  There  is  one  additional  axiom,  governing  the  time-evolution  of 
the  system,  which  we  consider  in  the  next  section. 
Axiom 1 The state of the system is represented by a unit vector $\psi$ in an
appropriate Hilbert space $\mathbf{H}$. If $\psi_1$ and $\psi_2$ are two unit
vectors in $\mathbf{H}$ with $\psi_2 = c\psi_1$ for some constant
$c \in \mathbb{C}$, then $\psi_1$ and $\psi_2$ represent the same physical state.

The  Hilbert  space  H  is  frequently  called  the  “quantum  Hilbert  space.” 
This  does  not,  however,  mean  that  H  is  some  variant  of  the  notion  of  a 
Hilbert  space,  the  way  a  quantum  group  is  a  variant  of  the  notion  of  a 
group.  Rather,  “quantum  Hilbert  space”  means  simply,  “the  Hilbert  space 
associated  with  a  given  quantum  system.” 

In  Axiom  1,  it  should  be  noted  that  unit  vectors  in  H  actually  represent 
only  the  “pure  states”  of  the  theory.  There  is  a  more  general  notion  of  a 
“mixed  state”  (described  by  a  “density  matrix”)  that  we  will  consider  in 
Chap.  19.  We  will  follow  the  custom  in  most  physics  texts  of  considering  at 
first  only  pure  states. 

Axiom 2 To each real-valued function $f$ on the classical phase space there
is associated a self-adjoint operator $\hat{f}$ on the quantum Hilbert space.

In almost all cases, the operator $\hat{f}$ is unbounded. This unboundedness
is unsurprising when we realize that physically relevant functions $f$ on
the classical phase space (e.g., position and momentum) are unbounded
functions. In the unbounded case, the notion of self-adjointness is rather
technical; see Definition 3.3 in Sect. 3.2. In most applications, it is not
really necessary to define $\hat{f}$ for all functions on the classical phase
space, but only for certain basic functions, such as position, momentum,
energy, and angular momentum. We will describe the quantizations of these basic
functions in this chapter. If one really needs to define $\hat{f}$ for an
arbitrary function $f$ (satisfying some regularity assumptions), the standard
approach is to use the Weyl quantization scheme, described in Chap. 13.

For a particle moving in $\mathbb{R}^1$, the classical phase space is
$\mathbb{R}^2$, which we think of as pairs $(x,p)$ with $x$ being the particle’s
position and $p$ being its momentum. The quantum Hilbert space in this case is
usually taken to be $L^2(\mathbb{R})$ [not $L^2(\mathbb{R}^2)$]. In that case, if
the function $f$ in Axiom 2 is the position function, $f(x,p) = x$, then the
associated operator $\hat{f}$ is the position operator $X$, given by
multiplication by $x$. If $f$ is the momentum function, $f(x,p) = p$, then
$\hat{f}$ is the momentum operator $P = -i\hbar\, d/dx$.

In the physics literature, a function $f$ on the classical phase space is called
a classical observable, meaning that it is some physical quantity that could
be observed by taking a measurement of the system. The corresponding
operator $\hat{f}$ is then called a quantum observable.

Axiom 3 If a quantum system is in a state described by a unit vector
$\psi \in \mathbf{H}$, the probability distribution for the measurement of some
observable $f$ satisfies

\[ E(f^m) = \langle \psi, (\hat{f})^m \psi \rangle. \tag{3.15} \]

In particular, the expectation value for a measurement of $f$ is given by

\[ \langle \psi, \hat{f} \psi \rangle. \tag{3.16} \]


Note that we have adopted the point of view that even in a quantum
mechanical system, what one is measuring is the classical observable $f$.
In the quantum case, however, $f$ no longer has a definite value, but only
probabilities, which are encoded by the quantum observable $\hat{f}$ and the
vector $\psi \in \mathbf{H}$.

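If $\psi$ is a nonzero vector in $\mathbf{H}$ but not a unit vector, then (3.16)
should be replaced by

\[ \langle \tilde{\psi}, \hat{f} \tilde{\psi} \rangle
   = \frac{\langle \psi, \hat{f} \psi \rangle}{\langle \psi, \psi \rangle}, \]

where $\tilde{\psi} := \psi / \|\psi\|$ is the unit vector associated with
$\psi$. It is convenient to assume that our vectors have been normalized to be
unit vectors, simply to avoid having to divide by $\langle \psi, \psi \rangle$
in our expectation values.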

Since $\hat{f}$ is assumed to be self-adjoint and every self-adjoint operator is
symmetric, Proposition 3.4 tells us that the moments $E(f^m)$, and in
particular the expectation value $E(f)$, are real numbers. Since $\hat{f}$ is
assumed to be self-adjoint and not just symmetric, the spectral theorem
(Chaps. 7 and 10) will give a canonical way of constructing a probability
measure $\mu_{\hat{f},\psi}$ on $\mathbb{R}$ that may be interpreted as the
probability distribution for measurements of $f$ in the state $\psi$.

Axiom 3 provides motivation for the idea that two unit vectors that differ
by a constant represent the same physical state. If $\psi_2 = c\psi_1$ with
$|c| = 1$, then for any operator $A$, we have

\[ \langle \psi_2, A\psi_2 \rangle = \langle c\psi_1, A c\psi_1 \rangle
   = |c|^2 \langle \psi_1, A\psi_1 \rangle = \langle \psi_1, A\psi_1 \rangle. \]

Thus, the expectation values of all observables are the same in the state
$\psi_2$ as in the state $\psi_1$.


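Notation 3.10 If $A$ is a self-adjoint operator on $\mathbf{H}$ and
$\psi \in \mathbf{H}$ is a unit vector, the expectation value of $A$ in the
state $\psi$ is denoted $\langle A \rangle_\psi$ and is defined (in light of
Axiom 3) to be

\[ \langle A \rangle_\psi = \langle \psi, A\psi \rangle. \tag{3.17} \]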
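The following sketch computes an expectation value of the form (3.17) on the finite-dimensional space $\mathbb{C}^2$ and checks that it is unchanged when the unit vector is multiplied by a constant of absolute value one. The matrix `A` and the vector `psi1` are hypothetical choices made only for illustration; the inner product is conjugate-linear in the first factor, as in the text.

```python
import cmath


def inner(u, v):
    """<u, v> = sum of conj(u_k) * v_k (conjugate on the first factor)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(A, v):
    """Matrix-vector product, with the matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def expectation(A, psi):
    """<A>_psi = <psi, A psi> for a unit vector psi."""
    return inner(psi, apply(A, psi))


# Hypothetical self-adjoint (Hermitian) matrix and unit vector.
A = [[2.0, 1j], [-1j, 3.0]]
psi1 = [complex(0.6, 0.0), complex(0.0, 0.8)]

c = cmath.exp(0.7j)            # a constant with |c| = 1
psi2 = [c * z for z in psi1]   # represents the same physical state

e1 = expectation(A, psi1)
e2 = expectation(A, psi2)
print(e1, e2)
```

For these particular choices one finds $\langle A \rangle_\psi = 1.68$ by hand, and the computed value has vanishing imaginary part, as expected for a self-adjoint operator.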

Proposition 3.11 (Eigenvectors) If a quantum system is in a state
described by a unit vector $\psi \in \mathbf{H}$ and for some quantum observable
$\hat{f}$ we have $\hat{f}\psi = \lambda\psi$ for some $\lambda \in \mathbb{R}$,
then

\[ E(f^m) = \langle (\hat{f})^m \rangle_\psi = \lambda^m \tag{3.18} \]

for all positive integers $m$. The unique probability measure consistent with
this condition is the one in which $f$ has the definite value $\lambda$, with
probability one.
What the proposition means is that if $\psi$ is an eigenvector for $\hat{f}$,
then measurements of $f$ for a particle in the state $\psi$ are not actually
random, but rather always give the answer of $\lambda$. If
$\hat{f}\psi = \lambda\psi$, then
$\langle \psi, (\hat{f})^m \psi \rangle = \lambda^m \langle \psi, \psi \rangle = \lambda^m$.
Thus, by (3.15), we want to find a probability measure $\mu$ on $\mathbb{R}$
such that

\[ \int_{\mathbb{R}} x^m \, d\mu(x) = \lambda^m, \tag{3.19} \]

for all non-negative integers $m$. The proposition is claiming that there is
one and only one such measure, namely the $\delta$-measure at the point
$\lambda$.

Because $\hat{f}$ is assumed to be self-adjoint and therefore symmetric,
Proposition 3.4 tells us that every eigenvalue for $\hat{f}$ is real.

Proof. The relation (3.18) follows from (3.15) and the fact that
$\hat{f}\psi = \lambda\psi$. Meanwhile, if $\mu$ is the $\delta$-measure at
$\lambda$, then certainly (3.19) holds. Finally, since the $m$th moment grows
only exponentially with $m$, even the most elementary uniqueness results for
the moment problem show that the $\delta$-measure is the only measure with
these moments. (See, e.g., Theorem 8.1 in Chap. 4 of [18].) ■

If, more generally, the state of the system is a linear combination of
eigenvectors for $\hat{f}$, measurements of $f$ will no longer be deterministic.

Example 3.12 Suppose $\hat{f}$ has an orthonormal basis $\{e_j\}$ of
eigenvectors with distinct (real) eigenvalues $\lambda_j$. Suppose also that
$\psi$ is a unit vector in $\mathbf{H}$ with the expansion

\[ \psi = \sum_{j=1}^{\infty} a_j e_j. \tag{3.20} \]

Then for a measurement in the state $\psi$ of the observable $f$, the observed
value of $f$ will always be one of the numbers $\lambda_j$. Furthermore, the
probability of observing the value $\lambda_j$ is given by

\[ \mathrm{Prob}\{f = \lambda_j\} = |a_j|^2. \tag{3.21} \]

Assuming that $\psi$ is in the domain of $(\hat{f})^m$, it is easy to verify
that the probabilities in (3.21) are consistent with the expectation values
given in Axiom 3. After all, if $\psi$ is given as in (3.20), then we can
readily calculate that $\langle \psi, (\hat{f})^m \psi \rangle$ equals
$\sum_j |a_j|^2 \lambda_j^m$, which is nothing but the $m$th moment associated
with the probability distribution in (3.21). In general, we cannot quite derive
(3.21) from Axiom 3, since the uniqueness results for the moment problem might
not apply. Nevertheless, (3.21) is the most natural candidate for the
probabilities, and we will assume that this formula holds.
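The consistency between (3.21) and Axiom 3 can be illustrated in a finite-dimensional setting. The sketch below assumes a hypothetical observable with three distinct eigenvalues and a unit vector given by its coefficients $a_j$ in the eigenbasis; it compares the moments of the distribution $|a_j|^2$ with $\langle \psi, (\hat{f})^m \psi \rangle$, using the fact that $\hat{f}$ acts diagonally in its eigenbasis. The names and numbers are illustrative only.

```python
import math


def probabilities(coeffs):
    """Prob{f = lambda_j} = |a_j|^2, as in (3.21)."""
    return [abs(a) ** 2 for a in coeffs]


def moment_from_probabilities(eigenvalues, coeffs, m):
    """m-th moment of the distribution that assigns |a_j|^2 to lambda_j."""
    return sum(p * lam ** m
               for p, lam in zip(probabilities(coeffs), eigenvalues))


def moment_from_operator(eigenvalues, coeffs, m):
    """<psi, f^m psi>, where f^m multiplies the j-th coefficient by lambda_j^m."""
    fm_psi = [lam ** m * a for lam, a in zip(eigenvalues, coeffs)]
    return sum(a.conjugate() * b for a, b in zip(coeffs, fm_psi)).real


# Hypothetical eigenvalues and coefficients; the |a_j|^2 sum to one.
eigenvalues = [-1.0, 0.5, 2.0]
coeffs = [0.5, 0.5j, math.sqrt(0.5)]

for m in range(4):
    print(m,
          moment_from_probabilities(eigenvalues, coeffs, m),
          moment_from_operator(eigenvalues, coeffs, m))
```

Here the probabilities are $0.25$, $0.25$, and $0.5$, so the first moment is $-0.25 + 0.125 + 1 = 0.875$, and the two ways of computing each moment agree.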

It is not difficult to extend Example 3.12 to the case where the eigenvalues
are not distinct: For any sequence $\{\lambda_j\}$ of eigenvalues, the
probability of observing some value $\lambda$ will be the sum of $|a_j|^2$ over
all those values of $j$ for which $\lambda_j = \lambda$. For any self-adjoint
operator $A$, the spectral theorem implies that $A$ has either an orthonormal
basis of eigenvectors or some continuous analog thereof. In particular, given a
self-adjoint operator $A$ and a unit vector $\psi \in \mathbf{H}$, the spectral
theorem will give us a probability measure on $\mathbb{R}$ that we interpret as
describing the probabilities for a measurement of $A$ in the state $\psi$. See
Proposition 7.17 in the bounded case and Definition 10.7 in the unbounded case.

Axiom 4 Suppose a quantum system is initially in a state $\psi$ and that a
measurement of an observable $f$ is performed. If the result of the measurement
is the number $\lambda \in \mathbb{R}$, then immediately after the measurement,
the system will be in a state $\psi'$ that satisfies

\[ \hat{f}\psi' = \lambda\psi'. \]

The passage from $\psi$ to $\psi'$ is called the collapse of the wave function.
Here $\hat{f}$ is the self-adjoint operator associated with $f$ by Axiom 2.

Let us assume again that $\hat{f}$ has an orthonormal basis of eigenvectors
$\{e_j\}$ with distinct eigenvalues $\lambda_j$. Then we can say, more
specifically, that if we observe the value $\lambda_j$ in a measurement of $f$
(and we will always observe one of the $\lambda_j$’s) then $\psi' = e_j$. That
is, the measurement “collapses” the wave function by throwing away all the
components of $\psi$ in the direction of the $e_k$’s, except the one with
$k = j$.
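A toy simulation can make the combination of Example 3.12 and Axiom 4 concrete. The sketch below assumes a finite eigenbasis with distinct eigenvalues, draws an outcome $\lambda_j$ with probability $|a_j|^2$, and replaces the coefficient vector by that of $e_j$. The function `measure` and the sample data are hypothetical; this is an illustration of the axioms, not a model of any physical apparatus.

```python
import random


def measure(eigenvalues, coeffs, rng):
    """Return (observed eigenvalue, coefficients of the collapsed state)."""
    probs = [abs(a) ** 2 for a in coeffs]          # Prob{f = lambda_j}
    j = rng.choices(range(len(coeffs)), weights=probs, k=1)[0]
    collapsed = [0j] * len(coeffs)
    collapsed[j] = 1 + 0j                          # psi' = e_j
    return eigenvalues[j], collapsed


# Hypothetical eigenvalues and unit vector (coefficients in the eigenbasis).
eigenvalues = [-1.0, 0.5, 2.0]
coeffs = [0.5, 0.5j, 0.5 ** 0.5]

rng = random.Random(0)  # seeded only to make the run reproducible
first_value, collapsed = measure(eigenvalues, coeffs, rng)
second_value, _ = measure(eigenvalues, collapsed, rng)
print(first_value, second_value)  # the two values coincide
```

Because the collapsed state has all its weight on a single eigenvector, an immediately repeated measurement returns the same value with probability one.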

This idea of the collapse of the wave function has generated an enormous
amount of discussion and controversy. One way to look at the situation is
to think that the wave function $\psi$ is not actually the state of the system,
although we continue to use the standard physics term, “state.” Rather,
the wave function is the thing that encodes the probabilities for the state of
the system. The collapse of the wave function is then something similar to
a conditional probability; the probabilities for future measurements of the
system should be consistent with the outcome of the measurement we just
made. Paul Dirac has described the collapse of the wave function as being
not a discontinuous change in the state of the system, but a discontinuous
change in our knowledge of the state of the system.

In any case, Axiom 4 guarantees the following reasonable principle: If
we measure $f$ and then measure $f$ again a very short time later, the result
of the second measurement will agree with the result of the first measurement.
Thus, immediately after the first measurement, the probabilities for
a second measurement of $f$ are not those associated with the vector $\psi$, but
rather those associated with the state $\psi'$. (Since $\psi'$ is an eigenvector
for $\hat{f}$ with eigenvalue $\lambda$, Proposition 3.11 tells us that
measurements of $f$ in the state $\psi'$ always give the value of $\lambda$.)

Note that Axiom 4 only tells us something about the state of the system
immediately after a measurement. Following the measurement, the state of
the system will evolve in time in the usual way (Sect. 3.7). A significant
time after the measurement, then, the system will probably no longer be
in the state $\psi'$.
Let us conclude this section by considering an example of how one makes
a measurement of a real-world physical system, namely, the hydrogen atom.
The Hamiltonian operator $\hat{H}$ for a hydrogen atom has negative eigenvalues
of the form

\[ -\frac{R}{n^2}, \tag{3.22} \]

where $R$ is the Rydberg constant and $n = 1, 2, 3, \ldots$. These energies will
be derived in Chap. 18. Negative eigenvalues are of greater interest than
positive ones, because negative eigenvalues describe states where the electron
is bound to the nucleus. If an electron is placed into a state having energy
$-R/n_1^2$, with $n_1 > 1$, it will eventually “decay” into a state with lower
energy, say, $-R/n_2^2$, with $n_2 < n_1$. (The most readily observed cases are
those with $n_2 = 2$ and $n_2 = 1$.) In the process of decaying, the electron
emits a photon, with the energy of the photon being equal to the change
in energy of the electron, namely,

\[ E_{\mathrm{photon}} = \frac{R}{n_2^2} - \frac{R}{n_1^2}. \tag{3.23} \]


Meanwhile, the frequency of the photon is proportional to its energy. Thus,
by observing the frequency of the emitted photon, one can determine the
change in energy of the electron and thus determine the values of $n_1$ and
$n_2$.
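The step of "working backwards" from a photon energy to the pair $(n_1, n_2)$ can be sketched as a small search. The code below uses (3.22) and (3.23) with the Rydberg energy taken to be about 13.605693 eV, which is an assumed numerical value not given in the text (and which ignores corrections such as the finite mass of the nucleus). The function names and the cutoff `n_max` are hypothetical.

```python
RYDBERG_EV = 13.605693  # assumed value of R, in electron volts


def level_energy(n):
    """Energy -R / n^2 of the n-th level, as in (3.22)."""
    return -RYDBERG_EV / n ** 2


def photon_energy(n1, n2):
    """Energy R/n2^2 - R/n1^2 emitted in the decay n1 -> n2, as in (3.23)."""
    return level_energy(n1) - level_energy(n2)


def identify_transition(e_photon, n_max=10, tol=1e-6):
    """Search small quantum numbers for a pair (n1, n2) matching e_photon."""
    for n2 in range(1, n_max):
        for n1 in range(n2 + 1, n_max + 1):
            if abs(photon_energy(n1, n2) - e_photon) < tol:
                return n1, n2
    return None  # no match within the cutoff and tolerance


e = photon_energy(3, 2)   # equals R * (1/4 - 1/9) = 5R/36
print(e, identify_transition(e))
```

With measured data one would need a tolerance that reflects the experimental uncertainty, and distinct transitions with nearly equal energies could then become indistinguishable; the exact-match search above is only meant to show the logic.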

A general “bound state” of the hydrogen atom (a state in which the
electron is bound to the nucleus) will be a linear combination of eigenvectors
for $\hat{H}$ with various different eigenvalues of the form (3.22). To measure
the energy of the electron, we simply wait for the electron to decay into a
lower-energy state and emit a photon, observe the frequency of the photon,
and work backwards to the energy of the electron. If we consider many
“identically prepared” electrons, all having the same wave function that
is a linear combination of eigenvectors, we will observe many different
frequencies for the emitted photons, and thus many different energies for the
electron. The probabilities for the observed energies of the electron will
follow the principle spelled out in Example 3.12.

In basic probability theory, if $Y$ is a random variable then the variance
$\sigma^2$ of $Y$ is computed as

\[ \sigma^2 = E\left[(Y - E(Y))^2\right], \]

where $E$ denotes the mean or expectation value of a random variable. The
standard deviation $\sigma := \sqrt{\sigma^2}$ is a measure of the “typical”
deviation from the mean $E(Y)$. Observe that the variance may be computed as

\[ \begin{aligned}
\sigma^2 &= E\left[Y^2 - 2E(Y)Y + E(Y)^2\right] \\
         &= E(Y^2) - 2E(Y)^2 + E(Y)^2 \\
         &= E(Y^2) - E(Y)^2. \qquad (3.24)
\end{aligned} \]
Definition 3.13 If $A$ is a self-adjoint operator on a Hilbert space
$\mathbf{H}$ and $\psi$ is a unit vector in $\mathbf{H}$, let $\Delta_\psi A$
denote the standard deviation associated with measurements of $A$ in the state
$\psi$, which is computed as

\[ (\Delta_\psi A)^2 = \langle A^2 \rangle_\psi - (\langle A \rangle_\psi)^2. \]

We refer to $\Delta_\psi A$ as the uncertainty of $A$ in the state $\psi$.
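As a small illustration of Definition 3.13, the sketch below computes the uncertainty from the formula $\langle A^2 \rangle_\psi - (\langle A \rangle_\psi)^2$, the operator analog of (3.24), for a hypothetical $2 \times 2$ observable. In an eigenvector the uncertainty is zero, in line with Proposition 3.11, while in an equal superposition of the two eigenvectors it is one. The matrix and vectors are illustrative choices, not taken from the text.

```python
import math


def inner(u, v):
    """<u, v> with the complex conjugate on the first factor."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(A, v):
    """Matrix-vector product, with the matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def uncertainty(A, psi):
    """Square root of <A^2>_psi - (<A>_psi)^2 for a unit vector psi."""
    mean = inner(psi, apply(A, psi)).real
    mean_sq = inner(psi, apply(A, apply(A, psi))).real
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))


A = [[1.0, 0.0], [0.0, -1.0]]                   # eigenvalues +1 and -1
psi_eigen = [1.0, 0.0]                          # an eigenvector of A
psi_mixed = [math.sqrt(0.5), math.sqrt(0.5)]    # equal superposition

print(uncertainty(A, psi_eigen), uncertainty(A, psi_mixed))
```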

For any single observable $A$, it is possible to choose $\psi$ so that
$\Delta_\psi A$ is as small as we like. In Chap. 12, however, we will see that
when two observables $A$ and $B$ do not commute, then $\Delta_\psi A$ and
$\Delta_\psi B$ cannot both be made arbitrarily small for the same $\psi$. In
particular, we will derive there the famous Heisenberg uncertainty principle,
which states that

\[ (\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2} \]

for all $\psi$ for which $\Delta_\psi X$ and $\Delta_\psi P$ are defined.


3.7  Time-Evolution  in  Quantum  Theory 

3.7.1 The Schrödinger Equation

Up to now, we have been considering the wave function $\psi$ at a fixed time.
We now consider the way in which the wave function evolves in time. Recall
that in the Hamiltonian formulation of classical mechanics (Sect. 2.5), the
time-evolution of the system is governed by the Hamiltonian (energy) function
$H$, through Hamilton’s equations. According to Axiom 2, there is a
corresponding self-adjoint linear operator $\hat{H}$ on the quantum Hilbert
space $\mathbf{H}$, which we call the Hamiltonian operator for the system. See
Sect. 3.7.4 for an example.

Recall that we motivated the definition of the momentum operator by
the de Broglie hypothesis, $p = \hbar k$, where $k$ is the spatial frequency of
the wave function. We can motivate the time-evolution in quantum mechanics by
a similar relation between the energy and the temporal frequency of our wave
function:

\[ E = \hbar\omega. \tag{3.25} \]

This relationship between energy and temporal frequency is nothing but the
relationship proposed by Planck in his model of blackbody radiation (Sect.
1.1.3). Suppose that a wave function $\psi_0$ has definite energy $E$, meaning
that $\psi_0$ is an eigenvector for $\hat{H}$ with eigenvalue $E$. Then (3.25)
means that the time-dependence of the wave function should be purely at
frequency $\omega = E/\hbar$. That is to say, if the state of the system at time
$t = 0$ is $\psi_0$, then the state of the system at any other time $t$ should
be

\[ \psi(t) = e^{-i\omega t}\psi_0 = e^{-iEt/\hbar}\psi_0. \tag{3.26} \]

We can rewrite (3.26) as a differential equation:

\[ \frac{d\psi}{dt} = -\frac{iE}{\hbar}\psi = \frac{E}{i\hbar}\psi. \tag{3.27} \]

Note that we are taking “temporal frequency $\omega$” to mean that the
time-dependence is of the form $e^{-i\omega t}$, whereas we took “spatial
frequency $k$” to mean that the space-dependence is of the form $e^{ikx}$, with
no minus sign in the exponent. This curious convention is convenient when we
look at pure exponential solutions to the free Schrödinger equation (Chap. 4)
of the form $\exp[i(kx - \omega t)]$, which describes a solution moving to the
right with speed $\omega/k$.

Equation (3.27) tells us the time-evolution for a particle that is initially
in a state of definite energy, that is, an eigenvector for the Hamiltonian
operator. A natural way to generalize this equation is to recognize that
$E\psi$ is nothing but $\hat{H}\psi$, since $\psi$ is just a multiple of
$\psi_0$, which is an eigenvector for $\hat{H}$ with eigenvalue $E$. Replacing
$E$ by $\hat{H}$ in (3.27) leads to the following general prescription for the
time-evolution of a quantum system.


Axiom 5 The time-evolution of the wave function $\psi$ in a quantum system
is given by the Schrödinger equation,

\[ \frac{d\psi}{dt} = \frac{1}{i\hbar}\hat{H}\psi. \tag{3.28} \]

Here $\hat{H}$ is the operator corresponding to the classical Hamiltonian $H$ by
means of Axiom 2.


Although both Hamilton’s equations and the Schrödinger equation
involve a Hamiltonian, the two equations otherwise do not seem parallel.
Of course, since quantum mechanics is not classical mechanics, we should
not expect the two theories to have the same time-evolution. Nevertheless,
we might hope to see some similarities between the time-evolution of
a classical system and that of the corresponding quantum system. Such
a similarity can be seen when we consider how the expectation values of
observables evolve in quantum mechanics.


Proposition 3.14 Suppose $\psi(t)$ is a solution of the Schrödinger equation
and $A$ is a self-adjoint operator on $\mathbf{H}$. Assuming certain natural
domain conditions hold, we have

\[ \frac{d}{dt} \langle A \rangle_{\psi(t)}
   = \left\langle \frac{1}{i\hbar}[A, \hat{H}] \right\rangle_{\psi(t)}, \tag{3.29} \]

where $\langle A \rangle_\psi$ is as in Notation 3.10 and where
$[\cdot, \cdot]$ denotes the commutator, defined as

\[ [A, B] = AB - BA. \]


Equation (3.29) should be compared to the way a function $f$ on the classical
phase space evolves in time along a solution of Hamilton’s equations:
$df/dt = \{f, H\}$. We see, then, that the commutator of operators (divided
by $i\hbar$) plays a role in quantum mechanics similar to the role of the
Poisson bracket in classical mechanics.

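Proof. Let $\psi(t)$ be a solution to the Schrödinger equation and let us
compute at first without worrying about domains of the operators involved. If
we use the product rule (Exercise 1) for differentiation of the inner product,
we obtain

\[ \begin{aligned}
\frac{d}{dt} \langle \psi(t), A\psi(t) \rangle
  &= \left\langle \frac{d\psi}{dt}, A\psi \right\rangle
   + \left\langle \psi, A\frac{d\psi}{dt} \right\rangle \\
  &= \frac{i}{\hbar} \langle \hat{H}\psi, A\psi \rangle
   - \frac{i}{\hbar} \langle \psi, A\hat{H}\psi \rangle \\
  &= \frac{1}{i\hbar} \langle \psi, [A, \hat{H}]\psi \rangle,
\end{aligned} \]

where in the last step we have used the self-adjointness of $\hat{H}$ to move it
to the other side of the inner product. Recall that we are following the
convention of putting the complex conjugate on the first factor in the inner
product, which accounts for the plus sign in the first term on the second
line. Rewriting this using Notation 3.10 gives the desired result.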

If $A$ and $\hat{H}$ are (as usual) unbounded operators, then the preceding
calculation is not completely rigorous. Since, however, we are deferring a
detailed examination of issues of unbounded operators until Chap. 9, let
us simply state the conditions needed for the calculation to be valid. For
every $t \in \mathbb{R}$, we need to have
$\psi(t) \in \mathrm{Dom}(A) \cap \mathrm{Dom}(\hat{H})$, we need
$A\psi(t) \in \mathrm{Dom}(\hat{H})$, and we need
$\hat{H}\psi(t) \in \mathrm{Dom}(A)$. (These conditions are needed for
$[A, \hat{H}]\psi(t)$ to be defined.) In addition, we need $A\psi(t)$ to be a
continuous path in $\mathbf{H}$. ■
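In finite dimensions, where no domain issues arise, the identity (3.29) can be checked numerically. The sketch below assumes $\hbar = 1$, a hypothetical diagonal Hamiltonian on $\mathbb{C}^2$ (so that the solution of the Schrödinger equation is explicit), and an observable $A$ that does not commute with it. The left-hand side of (3.29) is approximated by a central difference in $t$.

```python
import cmath

HBAR = 1.0                        # assumption: units with hbar = 1
E = [1.0, 2.5]                    # hypothetical eigenvalues of H
H = [[E[0], 0.0], [0.0, E[1]]]    # diagonal Hamiltonian
A = [[0.0, 1.0], [1.0, 0.0]]      # self-adjoint, does not commute with H
psi0 = [0.6, 0.8j]                # unit vector at time t = 0


def psi(t):
    """Solution of the Schrodinger equation for the diagonal H above."""
    return [cmath.exp(-1j * Ek * t / HBAR) * a for Ek, a in zip(E, psi0)]


def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def expect(M, v):
    return inner(v, apply(M, v))


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


t, h = 0.3, 1e-5
# Left-hand side of (3.29): d/dt <A>, by a central difference.
lhs = (expect(A, psi(t + h)) - expect(A, psi(t - h))).real / (2 * h)
# Right-hand side of (3.29): <(1/(i hbar)) [A, H]>.
rhs = (expect(commutator(A, H), psi(t)) / (1j * HBAR)).real
print(lhs, rhs)
```

For this example one can also compute by hand that $\langle A \rangle_{\psi(t)} = 0.96 \sin(1.5\,t)$, so both sides should be close to $1.44 \cos(0.45)$ at $t = 0.3$.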

Note that to see interesting behavior in the time-evolution of a quantum
system, there has to be noncommutativity present. If all the physically
interesting operators $A$ commuted with the Hamiltonian operator $\hat{H}$, then
$[A, \hat{H}]$ would be zero and the expectation values of these operators would
be constant in time. Noncommutativity of the basic operators is therefore
an essential property of quantum mechanics. In the case of a particle in
$\mathbb{R}^1$, noncommutativity is built into the commutation relation for $X$
and $P$, given in Proposition 3.8.

Although it is not reasonable to have all physically interesting operators
commute with $\hat{H}$, there may be some operators with this property. If
$[A, \hat{H}] = 0$, then the expectation value of $A$ (and, indeed, all the
moments of $A$) is independent of time along any solution of the Schrödinger
equation. We may therefore call such an operator $A$ a conserved quantity (or
constant of motion). Just as in the classical setting, conserved quantities
(when we can find them) are helpful in understanding how to solve the
Schrödinger equation.

Proposition 3.14 suggests that the map

\[ (A, B) \mapsto \frac{1}{i\hbar}[A, B], \]

where  A  and  B  are  self-adjoint  operators,  plays  a  role  similar  to  that  of  the 
Poisson  bracket  in  classical  mechanics.  This  analogy  is  supported  by  the 
following  list  of  elementary  properties  of  the  commutator,  which  should  be 
compared  to  the  properties  of  the  Poisson  bracket  listed  in  Proposition  2.23. 

Proposition 3.15 For any vector space $V$ over $\mathbb{C}$ and linear operators
$A$, $B$, and $C$ on $V$, the following relations hold.

1. $[A, B + \alpha C] = [A, B] + \alpha[A, C]$ for all $\alpha \in \mathbb{C}$

2. $[B, A] = -[A, B]$

3. $[A, BC] = [A, B]C + B[A, C]$

4. $[A, [B, C]] = [[A, B], C] + [B, [A, C]]$

Property  4  is  equivalent  to  the  Jacobi  identity , 

\[ [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0, \tag{3.30} \]

as  can  easily  be  seen  using  the  skew-symmetry  of  the  commutator. 

Proof.  The  first  two  properties  of  the  commutator  are  obvious,  and  the 
third  is  easily  verified  by  writing  things  out.  Property  4  can  also  be  proved 
by  writing  things  out,  but  it  is  slightly  messier.  Each  of  the  three  double 
commutators  on  the  left-hand  side  of  (3.30)  generates  four  terms,  for  a  total 
of  12  terms.  Each  term  has  the  operators  A,  B ,  and  C  multiplied  together 
in  some  order.  It  is  a  straightforward  but  unenlightening  calculation  to 
verify  that  each  of  the  six  possible  orderings  of  A ,  B,  and  C  occurs  twice, 
with  opposite  signs.  ■ 
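The "straightforward but unenlightening" verification can be delegated to a short computation. The sketch below checks the skew-symmetry, the product rule of Property 3, and the Jacobi identity (3.30) for three arbitrarily chosen integer matrices, using exact integer arithmetic so that no rounding is involved. The matrices are hypothetical; since the identities hold for all linear operators, any other choice would do.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def scale(a, X):
    return [[a * x for x in row] for row in X]


def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return add(matmul(X, Y), scale(-1, matmul(Y, X)))


# Arbitrary integer matrices, chosen only for illustration.
A = [[1, 2], [0, -1]]
B = [[0, 1], [3, 2]]
C = [[2, -1], [1, 1]]
ZERO = [[0, 0], [0, 0]]

skew_ok = comm(B, A) == scale(-1, comm(A, B))
product_rule_ok = comm(A, matmul(B, C)) == add(matmul(comm(A, B), C),
                                               matmul(B, comm(A, C)))
jacobi_sum = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))),
                 comm(C, comm(A, B)))
print(skew_ok, product_rule_ok, jacobi_sum == ZERO)
```

A check on one triple of matrices is of course not a proof, but it is a quick guard against sign errors when writing out the twelve terms by hand.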

If $A$ and $B$ are bounded self-adjoint operators on some Hilbert space,
then it is straightforward to check that $(1/(i\hbar))[A, B]$ is again
self-adjoint (Exercise 3). If $A$ and $B$ are unbounded self-adjoint operators,
then the operator $(1/(i\hbar))[A, B]$ will be self-adjoint under suitable
assumptions on the domains of $A$ and $B$.

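Proposition 3.16 If $\phi(t)$ and $\psi(t)$ are solutions to the Schrödinger
equation (3.28), the quantity $\langle \phi(t), \psi(t) \rangle$ is independent
of $t$. In particular, $\|\psi(t)\|$ is independent of $t$, for any solution
$\psi(t)$ of the Schrödinger equation.

Proof. Using again the product rule, we have

\[ \begin{aligned}
\frac{d}{dt} \langle \phi(t), \psi(t) \rangle
  &= \left\langle \frac{d\phi}{dt}, \psi(t) \right\rangle
   + \left\langle \phi(t), \frac{d\psi}{dt} \right\rangle \\
  &= -\frac{1}{i\hbar} \langle \hat{H}\phi(t), \psi(t) \rangle
   + \frac{1}{i\hbar} \langle \phi(t), \hat{H}\psi(t) \rangle.
\end{aligned} \]

Since $\hat{H}$ is self-adjoint, we can move $\hat{H}$ to the other side of the
inner product and the derivative is equal to 0. ■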

3.7.2 Solving the Schrödinger Equation by Exponentiation

The Schrödinger equation is an example of an equation of the form

\[ \frac{dv}{dt} = Av, \tag{3.31} \]

where $A$ is a linear operator on a Hilbert space. (In the Schrödinger case,
we have $A = -(i/\hbar)\hat{H}$.) Let us think of (3.31) in the case where the
Hilbert space is the finite-dimensional space $\mathbb{C}^n$. In that case, we
can think of $A$ as an $n \times n$ matrix, in which case (3.31) is the sort of
equation encountered in the elementary theory of ordinary differential
equations. The solution of this system (in the finite-dimensional case) can be
expressed as

\[ v(t) = e^{tA}v_0, \]

where the matrix exponential $e^{tA}$ is defined by a convergent power series
and where $v_0 = v(0)$ is the initial condition. If $A$ is diagonalizable, then
the exponential can be computed by using a basis of eigenvectors. (See
Sect. 16.4 for more information.)

The Schrödinger equation simply replaces $\mathbb{C}^n$ by a Hilbert space
$\mathbf{H}$ and the matrix $A$ by the linear operator $-(i/\hbar)\hat{H}$.

Claim 3.17 Suppose $\hat{H}$ is a self-adjoint operator on $\mathbf{H}$. If a
reasonable meaning can be given to the expression $e^{-it\hat{H}/\hbar}$, then
the Schrödinger equation can be solved by setting

\[ \psi(t) = e^{-it\hat{H}/\hbar}\psi_0. \tag{3.32} \]

To see why the claim should be true, we expect that we can differentiate
the operator-valued expression $e^{-it\hat{H}/\hbar}$ with respect to $t$ as we
would in the finite-dimensional case. The differentiation, then, would pull
down a factor of $-i\hat{H}/\hbar$, which would indicate that $\psi(t)$ indeed
solves the Schrödinger equation. Furthermore, when $t = 0$,
$e^{-it\hat{H}/\hbar}$ should be equal to $I$, so that $\psi(0)$ is indeed
$\psi_0$.

If $\hat{H}$ is a bounded operator (which is rarely the case), then the
exponential $e^{-it\hat{H}/\hbar}$ can be defined by a convergent power series,
precisely as in the finite-dimensional case. In that case, Claim 3.17 is an
easily proved theorem.
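In the finite-dimensional (hence bounded) case, the power series for the exponential can be summed directly. The sketch below assumes $\hbar = 1$ and a hypothetical $2 \times 2$ Hamiltonian whose square is the identity, so that the series can be compared with the closed form $\cos(t)\,I - i\sin(t)\,\hat{H}$. The helper `mat_exp` is a naive truncated series written for illustration; it is not a numerically robust algorithm for general matrices.

```python
import math


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(M, terms=40):
    """Partial sum I + M + M^2/2! + ... of the exponential series."""
    n = len(M)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        # After this line, term equals M^k / k!.
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, term)]
    return total


HBAR = 1.0                       # assumption: units with hbar = 1
H = [[0.0, 1.0], [1.0, 0.0]]     # hypothetical Hamiltonian with H^2 = I
t = 0.8

U = mat_exp([[-1j * t / HBAR * h for h in row] for row in H])

# Since H^2 = I, the series sums to cos(t) I - i sin(t) H.
closed_form = [[math.cos(t), -1j * math.sin(t)],
               [-1j * math.sin(t), math.cos(t)]]
max_error = max(abs(U[i][j] - closed_form[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```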
In the more typical case where $\hat{H}$ is unbounded, convergence of the series
for the exponential is a rather delicate matter, and it is better instead to
use the spectral theorem. We leave a general discussion of the spectral
theorem to Chaps. 7 and 10, and here consider only the case of a pure
point spectrum. A (possibly unbounded) self-adjoint operator $\hat{H}$ is said
to have a pure point spectrum if there exists an orthonormal basis $\{e_j\}$ for
$\mathbf{H}$ consisting of eigenvectors for $\hat{H}$. If
$\hat{H}e_j = E_j e_j$ for some $E_j \in \mathbb{R}$, then the exponential can be
defined by requiring that

\[ e^{-it\hat{H}/\hbar} e_j = e^{-itE_j/\hbar} e_j. \tag{3.33} \]

The operator $e^{-it\hat{H}/\hbar}$ is unitary and thus bounded; it is the unique
bounded operator on $\mathbf{H}$ satisfying (3.33).
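The prescription (3.33) translates directly into a recipe on coefficients: if $\psi_0 = \sum_j a_j e_j$, then each $a_j$ is multiplied by $e^{-itE_j/\hbar}$. The sketch below assumes $\hbar = 1$, a finite list of hypothetical eigenvalues, and vectors represented by their coefficients in the eigenbasis. It checks that the norm and the inner product of two evolved states are independent of $t$ (as in Proposition 3.16) and that evolving by $s$ and then by $t$ agrees with evolving by $s + t$.

```python
import cmath
import math

HBAR = 1.0                            # assumption: units with hbar = 1
energies = [0.5, 1.5, 2.5]            # hypothetical eigenvalues E_j
psi0 = [0.5, 0.5j, math.sqrt(0.5)]    # coefficients a_j of a unit vector
phi0 = [1.0, 0.0, 0.0]                # a second initial state


def evolve(coeffs, t):
    """Multiply the j-th coefficient by exp(-i t E_j / hbar), as in (3.33)."""
    return [cmath.exp(-1j * E * t / HBAR) * a
            for E, a in zip(energies, coeffs)]


def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))


norm_later = math.sqrt(inner(evolve(psi0, 3.7), evolve(psi0, 3.7)).real)
overlap_0 = inner(phi0, psi0)
overlap_t = inner(evolve(phi0, 3.7), evolve(psi0, 3.7))

two_steps = evolve(evolve(psi0, 1.0), 2.0)
one_step = evolve(psi0, 3.0)
group_error = max(abs(a - b) for a, b in zip(two_steps, one_step))
print(norm_later, overlap_0, overlap_t, group_error)
```

In infinite dimensions the same formula applies term by term to the expansion of $\psi_0$; the finite list here is only a stand-in for a pure point spectrum.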

It is not precisely true that every self-adjoint operator has an orthonormal
basis of eigenvectors, even if the operator is bounded. Nevertheless,
given a self-adjoint operator $A$, the spectral theorem tells us that there is a
decomposition of $\mathbf{H}$ into “generalized eigenspaces” for $A$. It is,
however, a bit complicated to state the precise sense of this decomposition,
especially in the case of unbounded operators. Still, Claim 3.17 allows us to
identify one goal for the spectral theorem: Whatever the spectral theorem says,
it ought to allow us to make sense of the expression $e^{iaA}$, for any
self-adjoint operator $A$ and real number $a$. This goal will indeed be
realized, in the bounded case in Chap. 7 and in the unbounded case in Chap. 10.

We should add two points of clarification regarding the expression (3.32).
First, in writing (3.32), we have not “really” solved the Schrödinger equation.
For this expression to be useful, we need to compute $e^{-it\hat{H}/\hbar}$ in
some relatively explicit way. If, for example, we can actually compute an
orthonormal basis of eigenvectors for $\hat{H}$, then in light of (3.33), we are
on our way to understanding the behavior of the operator
$e^{-it\hat{H}/\hbar}$. Second, although $\hat{H}$ is an unbounded operator,
which is not defined on all of $\mathbf{H}$ but only on a dense subspace, the
operator $e^{-it\hat{H}/\hbar}$ is unitary and defined on all of $\mathbf{H}$.
Thus, the right-hand side of (3.32) makes sense for any $\psi_0$ in
$\mathbf{H}$. Nevertheless, we cannot expect that $e^{-it\hat{H}/\hbar}\psi_0$
actually solves the Schrödinger equation (in the natural Hilbert space sense)
unless $\psi_0$ belongs to the domain of $\hat{H}$. (See Lemma 10.17 in
Sect. 10.2.)


3.7.3 Eigenvectors and the Time-Independent Schrödinger
Equation

As we saw in the preceding section, eigenvectors for the Hamiltonian operator
are of great importance in solving the Schrödinger equation. In light
of this fact, we make the following definition.
Definition 3.18 If $\hat{H}$ is the Hamiltonian operator for a quantum system,
the eigenvector equation

\[ \hat{H}\psi = E\psi, \quad E \in \mathbb{R}, \tag{3.34} \]

is called the time-independent Schrödinger equation.

As always in eigenvector equations, we are trying to determine both the
numbers $E$ for which (3.34) has a nonzero solution (the eigenvalues) and the
corresponding vectors $\psi$ (the eigenvectors). When quantum texts speak of
“solving,” say, the quantum harmonic oscillator, what they usually mean is
finding all of the solutions to the time-independent Schrödinger equation.
(See, e.g., Chaps. 5 and 11.) If $\psi$ is a solution to the time-independent
Schrödinger equation, then the solution to the time-dependent Schrödinger
equation with initial condition $\psi$ is simply
$\psi(t) = e^{-itE/\hbar}\psi$. Since $\psi(t)$ is just a constant multiple of
$\psi$, we see that $\psi(t)$ represents the same physical state as $\psi$.
Thus, a solution to the time-independent Schrödinger equation is sometimes
called a stationary state.
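
As a finite-dimensional illustration of a stationary state, the following sketch checks that an eigenvector only picks up the phase $e^{-itE/\hbar}$ under time evolution. It is a toy model and not taken from the text: the Hilbert space is assumed to be $\mathbb{C}^2$, the Hamiltonian is the assumed matrix `H` below (which satisfies $H^2 = I$, so that $e^{-itH/\hbar} = \cos(t/\hbar)I - i\sin(t/\hbar)H$), units with $\hbar = 1$ are assumed, and the helper names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1

# Toy Hamiltonian on C^2, assumed for illustration only. Note that H*H = I.
H = [[0.0, 1.0], [1.0, 0.0]]


def apply(M, v):
    """Apply a 2x2 matrix M to a vector v in C^2."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def evolution_operator(t):
    """exp(-i t H / hbar) = cos(t/hbar) I - i sin(t/hbar) H, valid since H*H = I."""
    c, s = math.cos(t / HBAR), math.sin(t / HBAR)
    return [[c - 1j * s * H[0][0], -1j * s * H[0][1]],
            [-1j * s * H[1][0], c - 1j * s * H[1][1]]]


# psi is a unit eigenvector of H with eigenvalue E = 1.
E = 1.0
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]

t = 0.7
psi_t = apply(evolution_operator(t), psi)       # time-evolved state
phase = cmath.exp(-1j * t * E / HBAR)
expected = [phase * psi[0], phase * psi[1]]     # e^{-itE/hbar} psi
```

Comparing `psi_t` with `expected` entry by entry shows that the evolved vector is a unit-modulus multiple of `psi`, which is why it represents the same physical state.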


3.7.4  The Schrödinger Equation in $\mathbb{R}^1$

Let us now consider the simplest example for the Hamiltonian operator
$\hat{H}$. For a particle moving in $\mathbb{R}^1$, recall (Sect. 3.5) that we
have identified the position operator $X$ as being multiplication by $x$ and
the momentum operator as $P = -i\hbar\, d/dx$. The classical Hamiltonian for
such a particle is typically taken to be of the form
$H(x,p) = p^2/(2m) + V(x)$, where $V$ is the potential energy function. In
that case, we may reasonably take

\[ \hat{H} = \frac{P^2}{2m} + V(X). \]

Here the operator $V(X)$ is simply multiplication by the potential energy
function $V(x)$. (This operator may also be thought of as the function $V$
applied to the operator $X$ in the sense of the functional calculus coming
from the spectral theorem.) We see, then, that

\[ \hat{H}\psi(x) = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x). \tag{3.35} \]


An operator of the form (3.35), or an analogously defined operator in higher
dimensions, is referred to as a Schrödinger operator. (The term Hamiltonian
operator refers more generally to whatever operator governs the
time-evolution of a quantum system, regardless of its form.)

If our Hamiltonian is of the form given in (3.35), then the time-dependent
Schrödinger equation takes the form

\[ \frac{\partial\psi(x,t)}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2\psi(x,t)}{\partial x^2} - \frac{i}{\hbar}V(x)\psi(x,t), \tag{3.36} \]

which is a linear partial differential equation. By contrast, Newton’s
equation for a particle in $\mathbb{R}^1$ is a typically nonlinear ordinary
differential equation.

For a particle in $\mathbb{R}^1$, the time-independent Schrödinger equation is
an ordinary differential equation, one that is linear but that has
nonconstant coefficients, unless $V$ happens to be constant. For simple
examples of the potential function $V$, there are relatively standard methods
of ordinary differential equations that can be brought to bear on the
time-independent Schrödinger equation.


3.7.5  Time-Evolution of the Expected Position
and Expected Momentum

Since a quantum particle does not have a fixed position or momentum, it
does not make sense to ask whether the particle satisfies Newton’s equation.
It does, however, make sense to ask whether the expected values of the
position and momentum satisfy Newton’s equation (in the form of Hamilton’s
equations).


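Proposition 3.19 Suppose $\psi(t)$ is a solution to the Schrödinger equation
(3.36) for a sufficiently nice potential $V$ and for a sufficiently nice
initial condition $\psi(0) = \psi_0$. Then the expected position and expected
momentum in the state $\psi(t)$ satisfy

\[ \frac{d}{dt}\langle X\rangle_{\psi(t)} = \frac{1}{m}\langle P\rangle_{\psi(t)} \tag{3.37} \]

\[ \frac{d}{dt}\langle P\rangle_{\psi(t)} = -\langle V'(X)\rangle_{\psi(t)}. \tag{3.38} \]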


The assumptions in the proposition are there for two reasons: first, to
ensure that $\hat{H}$ is actually a self-adjoint operator (see Sect. 9.9) and
second, to ensure that the domain assumptions in Proposition 3.14 are
satisfied. If we assume, for example, that $V(x)$ is a bounded-below
polynomial in $x$ and that $\psi_0$ belongs to the Schwartz space (A.15), then
both of these concerns will be taken care of. Once these technicalities are
addressed, the proof of Proposition 3.19 is a straightforward application of
Proposition 3.14; see Exercise 4. Note that (3.37) says that in a certain
sense, the velocity of a quantum particle is $1/m$ times the momentum, just as
in the classical case.

At first glance, it might appear that the pair
$(\langle X\rangle_{\psi(t)}, \langle P\rangle_{\psi(t)})$ is a solution to
Hamilton’s equations, and indeed (3.37) is precisely what Hamilton’s
equations require. To get a solution to Hamilton’s equations, however, we
would need the right-hand side of (3.38) to equal
$-V'(\langle X\rangle_{\psi(t)})$. But in general,

\[ \langle V'(X)\rangle_{\psi(t)} \neq V'(\langle X\rangle_{\psi(t)}). \]

Consider, for example, the case $V'(x) = x^3 + x^2$. If $\psi$ is an even
function, then $\langle X\rangle_\psi = 0$ and so
$V'(\langle X\rangle_\psi) = 0$. But $\langle X^3 + X^2\rangle_\psi$ will not
be zero, because the $X^3$ term will be zero and the $X^2$ term will be
positive. We conclude, then, that $\langle X\rangle_{\psi(t)}$ and
$\langle P\rangle_{\psi(t)}$ usually do not evolve along solutions to
Hamilton’s equations.
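
The example with $V'(x) = x^3 + x^2$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the even wave function is taken to be the Gaussian $\psi(x) = \pi^{-1/4}e^{-x^2/2}$ (our choice, not the text's), expectation values are approximated by a Riemann sum on a symmetric grid, and all function names are hypothetical.

```python
import math


def expectation(f, density, xs, dx):
    """Riemann-sum approximation of the integral of f(x) * density(x) dx."""
    return sum(f(x) * density(x) for x in xs) * dx


def density(x):
    """|psi(x)|^2 for the assumed even wave function psi(x) = pi**(-1/4) exp(-x^2/2)."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def v_prime(x):
    """Derivative of the potential used in the example: V'(x) = x^3 + x^2."""
    return x ** 3 + x ** 2


dx = 0.001
xs = [-10.0 + k * dx for k in range(20001)]  # grid on [-10, 10]

mean_x = expectation(lambda x: x, density, xs, dx)     # <X>, zero by symmetry
mean_v_prime = expectation(v_prime, density, xs, dx)   # <V'(X)> = <X^3> + <X^2>
v_prime_at_mean = v_prime(mean_x)                      # V'(<X>)
```

For this Gaussian, $\langle X^2\rangle_\psi = 1/2$ and $\langle X^3\rangle_\psi = 0$, so `mean_v_prime` comes out close to 0.5 while `v_prime_at_mean` is essentially zero, in line with the argument above.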

There is, however, one case in which $\langle V'(X)\rangle_\psi$ coincides
with $V'(\langle X\rangle_\psi)$, and that is the case in which $V$ is
quadratic, in which case $V'$ is linear, say $V'(x) = ax + b$. In that case
we have

\[ \langle V'(X)\rangle_\psi = \langle aX + bI\rangle_\psi = a\langle X\rangle_\psi + b = V'(\langle X\rangle_\psi). \]

Thus, the expected position and expected momentum do follow classical
trajectories in the case of a quadratic potential. It is not surprising that
this case is special in quantum mechanics, since it is also special in
classical mechanics; this is the case in which Newton’s law is a linear
differential equation.

Although the expected position and expected momentum do not (in general)
exactly follow classical trajectories, they will do so approximately under
certain conditions. If the wave function $\psi(x)$ is concentrated mostly
near a single point $x = x_0$, then $\langle V'(X)\rangle_\psi$ and
$V'(\langle X\rangle_\psi)$ will both be approximately equal to $V'(x_0)$. In
that case, the expected position and expected momentum of the particle will
approximately follow a classical trajectory, at least for as long as the wave
function remains concentrated near a single point.


3.8  The  Heisenberg  Picture 


The “Heisenberg picture” of quantum mechanics is based on Heisenberg’s
matrix model of quantum mechanics (Sect. 1.3). In the Heisenberg picture,
one thinks of the operators (quantum observables) as evolving in time, while
the vectors in the Hilbert space (quantum states) remain independent of
time. This is to be contrasted with the approach to quantum mechanics
we have been using up to now (the “Schrödinger picture”), in which the
observables are independent of time and the states evolve in time.


Definition 3.20 In the Heisenberg picture, each self-adjoint operator $A$
evolves in time according to the operator-valued differential equation

\[ \frac{dA(t)}{dt} = \frac{1}{i\hbar}[A(t), \hat{H}], \tag{3.39} \]

where $\hat{H}$ is the Hamiltonian operator of the system, and where
$[\cdot,\cdot]$ is the commutator, given by $[A, B] = AB - BA$.


Note that since $\hat{H}$ commutes with itself, the operator $\hat{H}$ remains
constant in time, even in the Heisenberg picture. This observation is the
quantum counterpart to the fact that the classical Hamiltonian $H$ remains
constant along a solution of Hamilton’s equations.

Given the self-adjoint operator $\hat{H}$, the spectral theorem will give us a
way to construct a family of unitary operators $e^{-it\hat{H}/\hbar}$,
$t \in \mathbb{R}$, and this family of operators computes the time-evolution
of states in the Schrödinger picture (Sect. 3.7.2). It is easy to check (at
least formally) that the solution to (3.39) can be expressed as

\[ A(t) = e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}. \tag{3.40} \]

Now, if $\psi$ is the state of the system (now considered to be independent
of time), then the expectation of $A(t)$ in the state $\psi$ is defined to be
$\langle A(t)\rangle_\psi = \langle\psi, A(t)\psi\rangle$. We may then compute
that

\begin{align*}
\langle A(t)\rangle_\psi &= \langle\psi, e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}\psi\rangle \\
&= \langle e^{-it\hat{H}/\hbar}\psi, A e^{-it\hat{H}/\hbar}\psi\rangle \\
&= \langle\psi(t), A\psi(t)\rangle,
\end{align*}

where $\psi(t)$ is the time-evolved state of the system in the Schrödinger
picture. Here, we have used that the adjoint of $e^{it\hat{H}/\hbar}$ is
$e^{-it\hat{H}/\hbar}$, which is formally clear and which is a consequence of
the spectral theorem.

Note that in the Schrödinger picture, $\langle\psi(t), A\psi(t)\rangle$ is
the expectation value of $A$ in the state $\psi(t)$. We conclude, then, that
the Heisenberg picture and the Schrödinger picture give rise to precisely the
same expectation values for observables as a function of time, and are
therefore physically equivalent. Although we will work primarily with the
Schrödinger picture of quantum mechanics, the Heisenberg picture is also
important, for example, in quantum field theory.
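
The equality of the two pictures can be seen concretely in a two-dimensional toy model. The sketch below is an illustration under assumptions that are ours rather than the text's: the Hilbert space is $\mathbb{C}^2$, the Hamiltonian is the matrix with rows $(0, 1)$ and $(1, 0)$, the observable `A` and the state `psi` are arbitrary sample choices, units with $\hbar = 1$ are used, and the helper names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def matmul(M, N):
    """Product of two 2x2 complex matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(M):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[complex(M[j][i]).conjugate() for j in range(2)] for i in range(2)]


def apply(M, v):
    """Apply a 2x2 matrix to a vector in C^2."""
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]


def inner(phi, psi):
    """Inner product on C^2, conjugate-linear in the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def evolution_operator(t):
    """exp(-i t H / hbar) for the toy Hamiltonian H = [[0, 1], [1, 0]] (H*H = I)."""
    c, s = math.cos(t / HBAR), math.sin(t / HBAR)
    return [[c, -1j * s], [-1j * s, c]]


A = [[1, 0], [0, -1]]   # sample observable
psi = [1, 0]            # sample initial state
t = 0.4

U = evolution_operator(t)
A_t = matmul(matmul(dagger(U), A), U)         # Heisenberg picture: A(t) as in (3.40)
heisenberg = inner(psi, apply(A_t, psi))      # <psi, A(t) psi>
psi_t = apply(U, psi)                         # Schrodinger picture: psi(t)
schrodinger = inner(psi_t, apply(A, psi_t))   # <psi(t), A psi(t)>
```

In this model both `heisenberg` and `schrodinger` equal $\cos(2t)$ up to rounding, so the two ways of computing the expectation value agree.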

Proposition 3.21 Suppose $\hat{H} = P^2/(2m) + V(X)$, where $V$ is a
bounded-below polynomial. Then for any $t \in \mathbb{R}$ we have

\[ \hat{H} = \frac{1}{2m}(P(t))^2 + V(X(t)). \tag{3.41} \]

Note that since $[\hat{H}, \hat{H}] = 0$, the Hamiltonian $\hat{H}$ is
independent of time, even in the Heisenberg picture. Thus, the right-hand
side of (3.41) is actually independent of $t$, even though $P(t)$ and $X(t)$
depend on $t$. Equation (3.41) holds also for sufficiently nice nonpolynomial
functions $V$, but some limiting argument would be required in the proof. The
assumption that $V$ be bounded below is to ensure that $\hat{H}$ is actually
an (essentially) self-adjoint operator; compare Sect. 9.10.

Lemma 3.22 Suppose $A$ is a self-adjoint operator on $\mathbf{H}$ and that
$A(\cdot)$ is a solution to (3.39) with $A(0) = A$. Then for any positive
integer $m$, the map

\[ t \mapsto (A(t))^m \]

is also a solution to (3.39).

That is to say, the time-evolution of the $m$th power of $A$ is the same as
the $m$th power of the time-evolution of $A$; that is, $A^m(t) = (A(t))^m$.

Proof. If we use (3.40), then the result holds because

\begin{align*}
e^{it\hat{H}/\hbar} A^m e^{-it\hat{H}/\hbar}
&= e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}\, e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar} \cdots e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar} \\
&= \left(e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}\right)^m.
\end{align*}

It is also easy to check that $A(t)^m$ satisfies the differential equation
(3.39).

■
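
The direct check mentioned at the end of the proof can be written out as follows. This is a formal computation that ignores domain questions for unbounded operators; it uses only the product rule (with the order of the operator factors preserved) and (3.39) itself.

```latex
% Formal check that A(t)^m satisfies (3.39), given that A(t) does.
\begin{align*}
\frac{d}{dt}A(t)^m
  &= \sum_{k=0}^{m-1} A(t)^k \,\frac{dA(t)}{dt}\, A(t)^{m-1-k}
     && \text{(product rule, keeping the order of factors)} \\
  &= \frac{1}{i\hbar}\sum_{k=0}^{m-1}
     A(t)^k \bigl(A(t)\hat{H} - \hat{H}A(t)\bigr) A(t)^{m-1-k} \\
  &= \frac{1}{i\hbar}\sum_{k=0}^{m-1}
     \bigl(A(t)^{k+1}\hat{H}A(t)^{m-1-k} - A(t)^{k}\hat{H}A(t)^{m-k}\bigr) \\
  &= \frac{1}{i\hbar}\bigl(A(t)^m\hat{H} - \hat{H}A(t)^m\bigr)
     && \text{(the sum telescopes)} \\
  &= \frac{1}{i\hbar}\bigl[A(t)^m, \hat{H}\bigr].
\end{align*}
```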

With  this  lemma  in  hand,  it  is  easy  to  prove  the  proposition. 

Proof of Proposition 3.21. On the one hand, since $[\hat{H}, \hat{H}] = 0$,
the time-evolved operator $\hat{H}(t)$ is simply equal to $\hat{H}$. On the
other hand, if we time-evolve $P^2/(2m) + V(X)$ using Lemma 3.22, we obtain
the expression on the right-hand side of (3.41). ■

Proposition 3.23 Suppose the Hamiltonian of a quantum system is as in
Proposition 3.21. Then the operators $X(t)$ and $P(t)$ defined by (3.39)
satisfy the following operator-valued differential equation:

\begin{align*}
\frac{dX}{dt} &= \frac{1}{m}P(t) \\
\frac{dP}{dt} &= -V'(X(t)). \tag{3.42}
\end{align*}


Proof.  See  Exercise  7.  ■ 

Proposition 3.23 means that the operator-valued functions $X(t)$ and $P(t)$
satisfy the operator analogs of the classical equations of motion
$dx/dt = p(t)/m$ and $dp/dt = -V'(x(t))$. Nevertheless, the expectation values
of $X(t)$ and $P(t)$ do not satisfy the ordinary equations of motion, as we
have already seen by calculating in the Schrödinger picture. If we take
expectation values in the system (3.42), we get the same answer as in
Proposition 3.19, namely,

\begin{align*}
\frac{d}{dt}\langle X(t)\rangle_\psi &= \frac{1}{m}\langle P(t)\rangle_\psi \\
\frac{d}{dt}\langle P(t)\rangle_\psi &= -\langle V'(X(t))\rangle_\psi.
\end{align*}

These are not the classical equations of motion, unless the expectation value
of the operator $V'(X(t))$ coincides with $V'$ applied to the expectation
value of $X(t)$, which is usually not the case.


3.9  Example:  A  Particle  in  a  Box 

Let us consider quantum mechanics in one space dimension for a particle
that is confined to move in a “box,” which we describe as the interval
$0 \le x \le L$. Our goal is to find all of the eigenvectors and eigenvalues
of the Schrödinger operator, that is, to find solutions of the
time-independent Schrödinger equation $\hat{H}\psi = E\psi$. In solving this
equation, we may think of the constraint to the box as follows. Imagine a
particle moving in $\mathbb{R}^1$ in the presence of a potential $V$ that is
0 for $x$ between 0 and $L$ and takes some very large constant value $C$ on
the rest of the real line. Classically, this would mean that the particle has
to have very high energy (greater than $C$) to escape from the box. Quantum
mechanically, if we have a solution of the time-independent Schrödinger
equation $\hat{H}\psi = E\psi$ for this potential (with $E \ll C$), then we
expect $\psi$ to decay rapidly for $x$ outside of the box. (We will see this
behavior explicitly in Chap. 5.) In the limit as $C$ tends to infinity, we
expect solutions of the time-independent Schrödinger equation to be zero
outside the box and to tend to zero as we approach the ends of the box.

The upshot of this discussion is that we are looking for smooth functions
$\psi$ on $[0, L]$ that satisfy the differential equation

\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi(x), \quad 0 \le x \le L \tag{3.43} \]

and the boundary conditions

\[ \psi(0) = \psi(L) = 0. \tag{3.44} \]

For $E > 0$, the solution space to (3.43) will be the span of two complex
exponentials, or equivalently a sine and a cosine function:

\[ \psi(x) = a\sin\left(\frac{\sqrt{2mE}}{\hbar}x\right) + b\cos\left(\frac{\sqrt{2mE}}{\hbar}x\right). \tag{3.45} \]

If we now impose the boundary condition $\psi(0) = 0$, we get that $b = 0$,
leaving only the sine term. If we then impose the condition $\psi(L) = 0$, we
will obtain $a = 0$ (which would mean that $\psi$ is identically zero) unless

\[ \sin\left(\frac{\sqrt{2mE}}{\hbar}L\right) = 0. \tag{3.46} \]

Since we are interested in solutions to (3.43) where $\psi$ is not
identically zero, we want (3.46) to hold. Thus, the argument of the sine
function must be an integer multiple of $\pi$. This condition imposes a
restriction on the value of $E$, namely that $E$ should be of the form

\[ E_j = \frac{j^2\pi^2\hbar^2}{2mL^2} \tag{3.47} \]

for some positive integer $j$.

It is a simple exercise (Exercise 8) to verify that for $E \le 0$, the only
solution to (3.43) satisfying the boundary conditions (3.44) is the one with
$\psi$ identically zero.

Proposition 3.24 The following functions are solutions to (3.43) satisfying
the boundary conditions (3.44):

\[ \psi_j(x) = \sqrt{\frac{2}{L}}\sin\left(\frac{j\pi x}{L}\right), \quad j = 1, 2, 3, \ldots, \]

and the corresponding eigenvalues $E_j$ are given by (3.47). The functions
$\psi_j$ form an orthonormal basis for the Hilbert space $L^2([0,L])$.

Proof. We have already verified the equation and eigenvalue for each
$\psi_j$. It is a simple computation to verify that the $\psi_j$’s are
orthonormal, and the elementary theory of Fourier series (Fourier sine
series, in this case) shows that the $\psi_j$’s form an orthonormal basis for
$L^2([0,L])$. ■
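
The “simple computation” of orthonormality, and the eigenvalue equation itself, can be spot-checked numerically. The following is a rough sketch and not a proof: it assumes units with $\hbar = m = L = 1$, takes the normalized sine functions $\sqrt{2/L}\,\sin(j\pi x/L)$ as the eigenfunctions, approximates the inner product by a midpoint rule, and approximates the second derivative by a central difference. The function names and step sizes are our own choices.

```python
import math

HBAR, M, L = 1.0, 1.0, 1.0  # assumption: units in which hbar = m = L = 1


def psi(j, x):
    """Normalized box eigenfunction sqrt(2/L) * sin(j*pi*x/L)."""
    return math.sqrt(2.0 / L) * math.sin(j * math.pi * x / L)


def energy(j):
    """Eigenvalue E_j = j^2 pi^2 hbar^2 / (2 m L^2), as in (3.47)."""
    return (j * math.pi * HBAR) ** 2 / (2.0 * M * L ** 2)


def overlap(j, k, n=2000):
    """Midpoint-rule approximation of the L^2([0, L]) inner product of psi_j and psi_k."""
    dx = L / n
    return sum(psi(j, (i + 0.5) * dx) * psi(k, (i + 0.5) * dx) for i in range(n)) * dx


def hamiltonian_applied(j, x, h=1e-4):
    """Central-difference approximation of -(hbar^2 / (2m)) * psi_j''(x)."""
    second = (psi(j, x + h) - 2.0 * psi(j, x) + psi(j, x - h)) / h ** 2
    return -HBAR ** 2 / (2.0 * M) * second
```

For instance, `overlap(1, 1)` should be close to 1, `overlap(2, 3)` close to 0, and `hamiltonian_applied(2, 0.3)` close to `energy(2) * psi(2, 0.3)`, up to discretization error.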

The Hamiltonian operator for this problem (in which $V = 0$ inside the
box) is given by

\[ \hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}. \]

This operator is an unbounded operator and is not defined on the whole
Hilbert space $L^2([0,L])$, but only on a dense subspace
$\mathrm{Dom}(\hat{H}) \subset L^2([0,L])$. The domain of $\hat{H}$ should be
chosen in such a way that $\hat{H}$ is essentially self-adjoint and, thus,
symmetric (Sect. 3.2), meaning that

\[ \langle\phi, \hat{H}\psi\rangle = \langle\hat{H}\phi, \psi\rangle \tag{3.48} \]

for all $\phi, \psi$ in $\mathrm{Dom}(\hat{H})$. For (3.48) to hold, $\phi$
and $\psi$ must satisfy appropriate boundary conditions, which will allow the
boundary terms in the integration by parts to be zero. (See Exercise 9.)

Mathematically, then, it is necessary to impose some boundary conditions
in order for $\hat{H}$ to be an essentially self-adjoint operator. The
particular choice of boundary conditions (3.44) is based on the idea of
approximating the box by a very large “confining” potential outside the box.
See Chap. 9 for an extensive discussion of domain issues for unbounded
operators.


3.10  Quantum Mechanics for a Particle in $\mathbb{R}^n$


Up to this point, we have been considering a quantum particle moving
in $\mathbb{R}^1$. It is straightforward, however, to generalize to a quantum
particle moving in $\mathbb{R}^n$. The Hilbert space for a particle in
$\mathbb{R}^n$ is $L^2(\mathbb{R}^n)$, rather than $L^2(\mathbb{R})$. Instead
of a single position operator, we have $n$ such operators, given by

\[ X_j\psi(\mathbf{x}) = x_j\psi(\mathbf{x}), \quad j = 1, \ldots, n. \]

Similarly, we have $n$ momentum operators, given by

\[ P_j\psi(\mathbf{x}) = -i\hbar\frac{\partial\psi}{\partial x_j}, \quad j = 1, \ldots, n. \]

As in the $\mathbb{R}^1$ case, $X_j$ does not commute with $P_j$ but satisfies
$[X_j, P_j] = i\hbar I$. On the other hand, $X_j$ commutes with $X_k$ and
$P_j$ commutes with $P_k$. Furthermore, $X_j$ commutes with $P_k$ for
$j \neq k$. These formulas are referred to as the canonical commutation
relations.


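Proposition 3.25 (Canonical Commutation Relations) The position
and momentum operators satisfy

\begin{align*}
\frac{1}{i\hbar}[X_j, X_k] &= 0 \\
\frac{1}{i\hbar}[P_j, P_k] &= 0 \\
\frac{1}{i\hbar}[X_j, P_k] &= \delta_{jk}I \tag{3.49}
\end{align*}

for all $1 \le j, k \le n$.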


These relations are the quantum counterparts of the Poisson bracket
relations among the position and momentum functions in classical mechanics.
Specifically, the role of the Poisson bracket in Proposition 2.24 is played in
Proposition 3.25 by the quantity $(1/(i\hbar))[\cdot,\cdot]$.
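
The commutation relation between positions and momenta can be verified directly on smooth functions. The computation below is a sketch that takes $X_j$ to be multiplication by $x_j$ and $P_k = -i\hbar\,\partial/\partial x_k$ (the $n$-dimensional analog of $P = -i\hbar\,d/dx$ from Sect. 3.5) and ignores domain questions.

```latex
% Check of [X_j, P_k] = i\hbar\,\delta_{jk} I on a smooth function \psi.
\begin{align*}
[X_j, P_k]\psi
  &= X_j P_k\psi - P_k X_j\psi \\
  &= -i\hbar\, x_j \frac{\partial\psi}{\partial x_k}
     + i\hbar\, \frac{\partial}{\partial x_k}\bigl(x_j\psi\bigr) \\
  &= -i\hbar\, x_j \frac{\partial\psi}{\partial x_k}
     + i\hbar\,\delta_{jk}\psi
     + i\hbar\, x_j \frac{\partial\psi}{\partial x_k} \\
  &= i\hbar\,\delta_{jk}\psi .
\end{align*}
```

The remaining two relations follow because multiplication operators commute with each other and mixed partial derivatives of smooth functions commute.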

If the classical Hamiltonian for a particle in $\mathbb{R}^n$ is of the usual
form (kinetic energy plus potential energy), then we may analogously define
the Hamiltonian operator to be of the form

\[ \hat{H} = \sum_{j=1}^n \frac{P_j^2}{2m} + V(\mathbf{X}), \]

where $V(\mathbf{X})$ denotes the result of applying the function $V$ to the
commuting family of operators $\mathbf{X} = (X_1, \ldots, X_n)$. It is natural
to identify $V(\mathbf{X})$ with the operator of multiplication by the
function $V(\mathbf{x})$. In that case, we may write $\hat{H}$ more explicitly
as

\[ \hat{H}\psi(\mathbf{x}) = -\frac{\hbar^2}{2m}\Delta\psi(\mathbf{x}) + V(\mathbf{x})\psi(\mathbf{x}), \tag{3.50} \]

where $\Delta$ is the Laplacian, given by

\[ \Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2}. \]

We refer to an operator of the form (3.50) as a Schrödinger operator.

We  may  also  introduce  angular  momentum  operators  defined  by  analogy 
to  the  classical  angular  momentum  functions. 


Definition 3.26 For each pair $(j, k)$ with $1 \le j, k \le n$, define the
angular momentum operator $\hat{J}_{jk}$ by the formula

\[ \hat{J}_{jk} = X_jP_k - X_kP_j. \]

As in the classical case, we have $\hat{J}_{jk} = 0$ when $j = k$. When
$j \neq k$, $X_j$ and $P_k$ commute, so the order of the factors in the
definition of $\hat{J}_{jk}$ is not important. Explicitly, we have

\[ \hat{J}_{jk} = -i\hbar\left(x_j\frac{\partial}{\partial x_k} - x_k\frac{\partial}{\partial x_j}\right). \]

The operator in parentheses is the angular derivative
($\partial/\partial\theta$) in the $(x_j, x_k)$ plane.

When $n = 3$, it is customary to use the quantum counterpart of the
classical angular momentum vector, namely,

\[ \hat{J}_1 := X_2P_3 - X_3P_2; \quad \hat{J}_2 := X_3P_1 - X_1P_3; \quad \hat{J}_3 := X_1P_2 - X_2P_1. \tag{3.51} \]

When $n = 3$, every $\hat{J}_{jk}$ with $j \neq k$ is one of the above three
operators or the negative thereof.


3.11  Systems  of  Multiple  Particles 


Suppose now we have a system of $N$ quantum particles moving in
$\mathbb{R}^n$. If the particles are all of different types (e.g., one
electron and one proton), then the Hilbert space for this system is
$L^2(\mathbb{R}^{nN})$. That is, the wave function $\psi$ of the system is a
function of variables $\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N$, with
each $\mathbf{x}^j$ belonging to $\mathbb{R}^n$. If we normalize $\psi$ to be
a unit vector in $L^2(\mathbb{R}^{nN})$, then
$|\psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)|^2$ is to be
interpreted as the joint probability distribution for the positions of the
$N$ particles.

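We may introduce position operators $X^j_k$ (the $k$th component of the
position of the $j$th particle) and momentum operators $P^j_k$ in obvious
analogy to the definition for a single particle. The typical Hamiltonian
operator for such a system is then

\[ \hat{H}\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N) = -\sum_{j=1}^N \frac{\hbar^2}{2m_j}\Delta_j\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N) + V(\mathbf{x}^1, \ldots, \mathbf{x}^N)\psi(\mathbf{x}^1, \ldots, \mathbf{x}^N), \]

where $m_j$ is the mass of the $j$th particle. Here $\Delta_j$ means the
Laplacian with respect to the variable $\mathbf{x}^j \in \mathbb{R}^n$, with
the other variables fixed.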

As we will see in Chap. 19, the Hilbert space for a composite system,
made up of various subsystems, is typically taken to be the (Hilbert) tensor
product of the individual Hilbert spaces. In the present context, we may
think of our system as being made up of $N$ subsystems, each being one of the
individual particles. Fortunately, there is a natural isomorphism
(Proposition 19.12) between $L^2(\mathbb{R}^{nN})$ and the tensor product of
$N$ copies of $L^2(\mathbb{R}^n)$, so that the approach we are taking here is
consistent with the general philosophy.


If the particles in question are identical (say, all electrons), then there
is an additional complication to the description of the Hilbert space for
the system. In standard quantum theory, we are supposed to believe that
“identical particles are indistinguishable.” What this means is that the wave
function should have the property that if we interchange, say,
$\mathbf{x}^1$ with $\mathbf{x}^2$, then the new wave function should
represent the same physical state as the original wave function. Recalling
that two unit vectors in the quantum Hilbert space represent the same
physical state if and only if they differ by a constant of absolute value 1,
this means we should have

\[ \psi(\mathbf{x}^2, \mathbf{x}^1, \mathbf{x}^3, \ldots, \mathbf{x}^N) = u\,\psi(\mathbf{x}^1, \mathbf{x}^2, \mathbf{x}^3, \ldots, \mathbf{x}^N) \]

for some constant $u$ with $|u| = 1$. Applying this rule twice gives that
$\psi$ is $u^2\psi$, so evidently $u$ must be either 1 or $-1$.

Particles in quantum mechanics are grouped into two types, according
to whether the constant $u$ in the previous paragraph is 1 or $-1$. Particles
with $u = 1$ are called bosons and particles with $u = -1$ are called
fermions. Whether a particle is a boson or a fermion is determined by the
spin of the particle, a concept that we have not yet introduced.
Nevertheless, we can say that particles without spin are bosons. For a
collection of $N$ identical spinless particles moving in $\mathbb{R}^3$, the
proper Hilbert space is the symmetric subspace of $L^2(\mathbb{R}^{3N})$, that
is, the space of functions in $L^2(\mathbb{R}^{3N})$ that are invariant under
arbitrary permutations of the variables. We will have more to say about spin
and systems of identical particles in Chaps. 17 and 19.
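
The two possible values of $u$ can be illustrated with two particles. The sketch below is a toy example under our own assumptions: each particle moves in one dimension rather than in $\mathbb{R}^3$, the starting amplitude `f` is an arbitrary non-symmetric function chosen for illustration, and the resulting combinations are left unnormalized.

```python
import math


def f(x1, x2):
    """An arbitrary non-symmetric two-particle amplitude (assumed for illustration)."""
    return math.exp(-(x1 - 1.0) ** 2 - 2.0 * (x2 + 0.5) ** 2)


def symmetric(x1, x2):
    """Combination that is unchanged when the two arguments are interchanged (u = 1)."""
    return f(x1, x2) + f(x2, x1)


def antisymmetric(x1, x2):
    """Combination that changes sign when the two arguments are interchanged (u = -1)."""
    return f(x1, x2) - f(x2, x1)


a, b = 0.3, -1.2  # sample positions
```

Evaluating `symmetric(b, a)` against `symmetric(a, b)` and `antisymmetric(b, a)` against `antisymmetric(a, b)` exhibits the factors $u = 1$ and $u = -1$; note also that `antisymmetric(a, a)` vanishes.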


3.12  Physics  Notation 


In quantum mechanics, physicists almost invariably use the Dirac notation
(or bra-ket notation) introduced by Dirac in 1939 [5]. This notation
is made up of Notations 3.27-3.29 below. In this section, we explore the
Dirac notation along with a few other notational differences between the
mathematics and physics literature.

Before proceeding, it is important to point out that when using Dirac
notation, it is essential that the complex conjugate in the inner product
should go on the first factor.


Notation 3.27 A vector $\psi$ in $\mathbf{H}$ is referred to as a ket and is
denoted $|\psi\rangle$. A continuous linear functional on $\mathbf{H}$ is
called a bra. For any $\phi \in \mathbf{H}$, let $\langle\phi|$ denote the bra
given by

\[ \langle\phi|(\psi) = \langle\phi, \psi\rangle. \]

That is to say, $\langle\phi|$ is the “inner product with $\phi$” functional.
The bracket (or bra-ket) of two vectors $\phi, \psi \in \mathbf{H}$ is the
result of applying the bra $\langle\phi|$ to the ket $|\psi\rangle$, namely
the inner product of $\phi$ and $\psi$, denoted $\langle\phi|\psi\rangle$.

If $A$ is an operator on $\mathbf{H}$ and $\phi$ is a vector in $\mathbf{H}$,
then we can form the linear functional $\langle\phi|A$, i.e., the linear map
$\psi \mapsto \langle\phi|A\psi\rangle$. Physicists generally write an
expression of this form as

\[ \langle\phi|A|\psi\rangle. \]

This notation emphasizes that there are two different ways of thinking of
this quantity. We may think of $\langle\phi|A|\psi\rangle$ either as the
linear functional $\langle\phi|A$ applied to the vector $|\psi\rangle$, or as
the linear functional $\langle\phi|$ applied to the vector $A|\psi\rangle$.

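Notation 3.28 For any $\phi$ and $\psi$ in $\mathbf{H}$, the expression
$|\phi\rangle\langle\psi|$ denotes the linear operator on $\mathbf{H}$ given
by

\[ (|\phi\rangle\langle\psi|)(\chi) = |\phi\rangle\langle\psi|\chi\rangle = \langle\psi|\chi\rangle\,|\phi\rangle. \]

That is, in mathematics notation, $|\phi\rangle\langle\psi|$ is the operator
sending $\chi$ to $\langle\psi, \chi\rangle\phi$.

The operator $|\phi\rangle\langle\psi|$ associates to each (ket) vector
$|\chi\rangle$ a new vector in the only way that makes notational sense: We
interpret $|\phi\rangle\langle\psi||\chi\rangle$ as the vector $|\phi\rangle$
multiplied by the scalar $\langle\psi|\chi\rangle$.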

Notation 3.29 Given a family of vectors in $\mathbf{H}$ labeled by, say,
three indices $n$, $l$, and $m$, rather than denoting these vectors as
$|\psi_{n,l,m}\rangle$, a physicist will denote them simply as
$|n,l,m\rangle$.

This notation is not without its pitfalls. If we have two different sets
of vectors labeled by the same set of indices, a mathematician can simply
label them as $\phi_{n,l,m}$ and $\psi_{n,l,m}$, but the physicist has a
problem.

As an example of the Dirac notation, suppose that an operator $\hat{H}$ has
an orthonormal basis of eigenvectors $\psi_n$. A physicist would express the
decomposition of a general vector in terms of this basis as

\[ I = \sum_n |n\rangle\langle n|, \tag{3.52} \]

where $\psi_n$ is represented simply as $|n\rangle$ and where
$|n\rangle\langle n|$ is (given that $|n\rangle$ is a unit vector) the
orthogonal projection onto the one-dimensional subspace spanned by the vector
$|n\rangle$.
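
In finite dimensions the operator $|\phi\rangle\langle\psi|$ is simply a matrix, and the resolution of the identity (3.52) can be checked by hand. The following sketch works in $\mathbb{C}^2$ with an orthonormal basis and sample vectors chosen by us purely for illustration; the function names are hypothetical, and the inner product is conjugate-linear in the first factor, as Dirac notation requires.

```python
def bracket(phi, psi):
    """<phi|psi>: inner product, conjugate-linear in the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def ket_bra(phi, psi):
    """Matrix of |phi><psi|, whose (i, j) entry is phi_i * conj(psi_j)."""
    return [[a * complex(b).conjugate() for b in psi] for a in phi]


def apply(M, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in M]


s = 2 ** -0.5
basis = [[s, s * 1j], [s, -s * 1j]]  # an orthonormal basis of C^2 (assumed example)
chi = [0.5 - 2j, 1.5 + 1j]           # sample vector

# |phi><psi| applied to chi equals the scalar <psi|chi> times phi.
phi, psi = basis[0], [1.0, 2.0j]
lhs = apply(ket_bra(phi, psi), chi)
rhs = [bracket(psi, chi) * a for a in phi]

# Resolution of the identity: the sum over n of |n><n| is the identity matrix.
identity = [[sum(ket_bra(n, n)[i][j] for n in basis) for j in range(2)]
            for i in range(2)]
```

Here `lhs` agrees with `rhs`, and `identity` is the 2-by-2 identity matrix up to rounding.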


Notation 3.30 In the physics literature, the complex conjugate of a complex
number $z$ is denoted as $z^*$, rather than $\bar{z}$, as in the mathematics
literature. What a mathematician calls the adjoint of an operator and denotes
by $A^*$, a physicist calls the Hermitian conjugate of $A$ and denotes by
$A^\dagger$. Physicists refer to self-adjoint operators as Hermitian.


We may express the concept of an adjoint (or Hermitian conjugate) of
an operator using Dirac notation, as follows. If $A$ is a bounded operator on
$\mathbf{H}$, then $A^\dagger$ is the unique bounded operator such that

\[ \langle\phi|A = \langle A^\dagger\phi| \]

for all $\phi$ in $\mathbf{H}$.

One peculiarity of the physics literature on quantum mechanics is a
conspicuous failure of most articles to state what the Hilbert space is.
Rather than starting by defining the Hilbert space in which they are working,
physicists generally start by writing down the commutation relations
that hold among various operators on the space. Thus, for example, a
physicist might begin with position and momentum operators $X$ and $P$,
satisfying $[X, P] = i\hbar I$, without ever specifying what space these
operators are operating on. The justification for this omission is,
presumably, the Stone-von Neumann theorem, which asserts that (provided the
operators satisfy the expected “exponentiated” relations) there is, up to
unitary equivalence, only one Hilbert space with operators satisfying these
relations and on which the operators act irreducibly. (See Chap. 14 for a
precise statement of the result.) It is, nevertheless, disconcerting for a
mathematician to encounter an entire paper full of computations involving
certain operators, without any specification of what space these operators
are operating on, let alone how the operators act on the space.

This practice among physicists represents something of a role reversal.
In the setting of linear algebra, for example, a mathematician might say,
“Let $V$ be an $n$-dimensional vector space over $\mathbb{R}$.” If a physicist
says, “Oh, so it’s $\mathbb{R}^n$,” the mathematician will reply, “No, no, you
don’t have to choose a basis.” By contrast, in quantum mechanics, it is the
physicist who does not want to choose a particular realization of the space.
A physicist will simply write down the commutation relations between, say,
$X$ and $P$. If pressed, the physicist might say that he is working in an
irreducible representation of those relations. If a mathematician then says,
“Oh, so it’s $L^2(\mathbb{R})$,” the physicist will reply, “No, no, there is
no preferred realization.”

Notation 3.31 Given an irreducible representation of the canonical
commutation relations, and given a vector $\psi$ in the corresponding Hilbert
space, a physicist will speak of the position wave function $\psi(x)$,
defined by

\[ \psi(x) = \langle x|\psi\rangle. \tag{3.53} \]

Here, $\langle x|$ is the bra associated with the ket $|x\rangle$, where
$|x\rangle$ is supposed to be an eigenvector for the position operator with
eigenvalue $x$.


See, again, Chap. 14 for the precise notion of “irreducible representation
of the canonical commutation relations.” One may similarly define the
momentum wave function by taking the inner product of $\psi$ with the
eigenvectors of the momentum operator, which are also non-normalizable. See
Sect. 6.6 for details.

A mathematician might find Notation 3.31 objectionable on the grounds
that the operator $X$ does not actually have any eigenvectors. After all,
it is harmless, in view of the Stone-von Neumann theorem, to work in
the “Schrödinger representation,” in which our Hilbert space is
$L^2(\mathbb{R})$ and the position operator $X$ is just multiplication by
$x$. Given a number $x_0$, there is no nonzero element $\psi$ of
$L^2(\mathbb{R})$ for which $X\psi = x_0\psi$. After all, any $\psi$
satisfying this equation would have to be supported at the point $x = x_0$,
in which case $\psi$ would equal zero almost everywhere and would be the zero
element of $L^2(\mathbb{R})$. A physicist, on the other hand, would say that
the desired eigenfunction is $\psi(x) = \delta(x - x_0)$, where $\delta$ is
the Dirac delta-“function.” The fact that $\delta(x - x_0)$ is not actually in
the Hilbert space $L^2(\mathbb{R})$ does not concern the physicist; it is
simply a “non-normalizable state.” The mathematical theory of such
non-normalizable states comes under the heading “generalized eigenvectors.”
See Sect. 6.6 for a discussion of this issue in the case of the eigenvectors
of the momentum operator.

A more subtle issue regarding the “position eigenvectors” is that each eigenvector is unique only up to multiplication by a constant. If one wants the momentum operator to act on the position wave function, as defined by (3.53), in the usual way, one must make a consistent choice of normalization of the eigenvectors of the position operators. Specifically, one should choose the constants in such a way that the exponentiated momentum operator $\exp(iaP/\hbar)$ maps $|x\rangle$ to $|x + a\rangle$.


3.13  Exercises 

1. Suppose that $\phi(t)$ and $\psi(t)$ are differentiable functions with values in a Hilbert space $\mathbf{H}$, meaning that the limit

\[ \frac{d\phi}{dt} := \lim_{h \to 0} \frac{\phi(t+h) - \phi(t)}{h} \]

exists in the norm topology of $\mathbf{H}$ for each $t$, and similarly for $\psi(t)$. Show that

\[ \frac{d}{dt}\langle \phi(t), \psi(t) \rangle = \left\langle \frac{d\phi}{dt}, \psi(t) \right\rangle + \left\langle \phi(t), \frac{d\psi}{dt} \right\rangle. \]

2. Suppose $A$ and $B$ are operators on a finite-dimensional Hilbert space and suppose that $AB - BA = cI$ for some constant $c$. Show that $c = 0$.

Note: This shows that the commutation relations in (3.8) are a purely infinite-dimensional phenomenon.

3. If $A$ is a bounded operator on a Hilbert space $\mathbf{H}$, then there exists a unique bounded operator $A^*$ on $\mathbf{H}$ satisfying $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ for all $\phi$ and $\psi$ in $\mathbf{H}$. (Appendix A.4.3.) The operator $A^*$ is called the adjoint of $A$, and $A$ is called self-adjoint if $A^* = A$.

(a) Show that for any bounded operator $A$ and constant $c \in \mathbb{C}$, we have $(cA)^* = \bar{c}A^*$, where $\bar{c}$ is the complex conjugate of $c$.

(b) Show that if $A$ and $B$ are self-adjoint, then the operator

\[ \frac{1}{i\hbar}[A, B] \]

is also self-adjoint.

4.  Verify  Proposition  3.19  using  Proposition  3.14.  Note  that  the  operator 
V'(X)  means  simply  the  operator  of  multiplication  by  the  function 
V'(x). 

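5. Suppose that $\psi$ is a unit vector in $L^2(\mathbb{R})$ such that the functions $x\psi(x)$ and $x^2\psi(x)$ also belong to $L^2(\mathbb{R})$. Show that

\[ \langle X^2 \rangle_\psi > (\langle X \rangle_\psi)^2. \]

Hint: Consider the integral

\[ \int_{-\infty}^{\infty} (x - a)^2 |\psi(x)|^2 \, dx, \]

where $a = \langle X \rangle_\psi$.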


6. Consider the Hamiltonian $H$ for a quantum harmonic oscillator, given by

\[ H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{k}{2}x^2, \]

where $k$ is the spring constant of the oscillator. Show that the function

\[ \psi_0(x) = \exp\left\{ -\frac{\sqrt{km}}{2\hbar} x^2 \right\} \]

is an eigenvector for $H$ with eigenvalue $\hbar\omega/2$, where $\omega := \sqrt{k/m}$ is the classical frequency of the oscillator.

Note: We will explore the eigenvectors and eigenvalues of $H$ in detail in Chap. 11.


7. Prove Proposition 3.23.
Hint: Show that $[P(t), H] = ([P, H])(t)$ and $[X(t), H] = ([X, H])(t)$.


8. (a) Find the general solution to (3.43), where $E$ is a negative real number. Show that the only such solution that satisfies the boundary conditions (3.44) is identically zero.

(b) Establish the same result as in Part (a) for $E = 0$.

9. (a) Suppose $\phi$ and $\psi$ are smooth functions on $[0, L]$ satisfying the boundary conditions (3.44). Using integration by parts, show that

\[ \langle \phi, H\psi \rangle = \langle H\phi, \psi \rangle, \]

where $H = -(\hbar^2/2m)\, d^2/dx^2$ and where

\[ \langle \phi, \psi \rangle = \int_0^L \overline{\phi(x)}\,\psi(x) \, dx. \]

(b) Show that the result of Part (a) fails if $\phi$ and $\psi$ are arbitrary smooth functions (not satisfying the boundary conditions).

10. Let $J_1$, $J_2$, and $J_3$ be the angular momentum operators for a particle moving in $\mathbb{R}^3$. Using the canonical commutation relations (Proposition 3.25), show that these operators satisfy the commutation relations

\[ \frac{1}{i\hbar}[J_1, J_2] = J_3; \qquad \frac{1}{i\hbar}[J_2, J_3] = J_1; \qquad \frac{1}{i\hbar}[J_3, J_1] = J_2. \]

This is the quantum mechanical counterpart to Exercise 19 in the previous chapter.


4 

The Free Schrödinger Equation


In this chapter, we consider various methods of solving the free Schrödinger equation in one space dimension. Here “free” means that there is no force acting on the particle, so that we may take the potential $V$ to be identically zero. Thus, the free Schrödinger equation is

\[ \frac{\partial \psi}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2 \psi}{\partial x^2}, \tag{4.1} \]

subject to an initial condition of the form

\[ \psi(x, 0) = \psi_0(x). \]

We will identify some key features of solutions to this equation, such as the “spread of the wave packet” and the distinction between “phase velocity” and “group velocity.” In particular, the notion of group velocity will confirm our expectation that a particle of momentum $p$ should travel with velocity $v = p/m$.

Before attempting to solve the free Schrödinger equation, let us make a simple observation about the time evolution of the expected values of the position and momentum. If we apply Proposition 3.19 in the case that $V$ is identically equal to zero, we have

\[ \frac{d}{dt}\langle X \rangle_{\psi(t)} = \frac{1}{m}\langle P \rangle_{\psi(t)}, \qquad \frac{d}{dt}\langle P \rangle_{\psi(t)} = 0. \]

Thus, the expectation value of $P$ is independent of time, which then means that the expectation value of $X$ is linear in time:

\[ \langle X \rangle_{\psi(t)} = \langle X \rangle_{\psi_0} + \frac{t}{m}\langle P \rangle_{\psi_0}, \]

\[ \langle P \rangle_{\psi(t)} = \langle P \rangle_{\psi_0}. \]

Thus, the free Schrödinger equation is one of the special cases in which the expected values of the position and momentum exactly follow classical trajectories (and those classical trajectories are very simple in the case $V = 0$).


4.1  Solution  by  Means  of  the  Fourier  Transform 


We look for solutions of the free Schrödinger equation on $\mathbb{R}^1$ of the form

\[ \psi(x, t) = e^{i(kx - \omega(k)t)}, \tag{4.2} \]

where $k$ is the frequency in space and $\omega(k)$ is the frequency in time, which is an as-yet-undetermined function of $k$. (Of course, such a solution is not square-integrable in $x$ for a fixed $t$, but we will find our way back to square-integrable solutions eventually.) Plugging this into (4.1) easily gives the formula for $\omega$ as a function of $k$:

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.3} \]

A formula of this sort, expressing the temporal frequency $\omega$ as a function of the spatial frequency $k$ in a solution of some partial differential equation, is called a dispersion relation.
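The substitution of (4.2) into (4.1) can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes units with $\hbar = m = 1$, and the names `omega`, `plane_wave`, and `schrodinger_residual` are illustrative. It estimates both sides of (4.1) by finite differences and confirms that their difference is small when $\omega(k)$ is given by (4.3).

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1
M = 1.0     # assumption: unit mass


def omega(k):
    """Dispersion relation (4.3): omega(k) = hbar * k^2 / (2m)."""
    return HBAR * k * k / (2.0 * M)


def plane_wave(x, t, k):
    """Pure exponential solution (4.2)."""
    return cmath.exp(1j * (k * x - omega(k) * t))


def schrodinger_residual(x, t, k, h=1e-4):
    """Finite-difference estimate of d(psi)/dt - (i hbar / 2m) d^2(psi)/dx^2."""
    dpsi_dt = (plane_wave(x, t + h, k) - plane_wave(x, t - h, k)) / (2.0 * h)
    d2psi_dx2 = (
        plane_wave(x + h, t, k) - 2.0 * plane_wave(x, t, k) + plane_wave(x - h, t, k)
    ) / (h * h)
    return dpsi_dt - (1j * HBAR / (2.0 * M)) * d2psi_dx2


# The residual is zero up to discretization and rounding error.
residual = abs(schrodinger_residual(x=0.3, t=0.7, k=2.0))
```

Replacing `omega` by any other function of `k` makes the residual visibly nonzero, which is one way to see that (4.3) is forced by the equation.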

Observe that (4.2) can be written as

\[ \psi(x, t) = \exp\left[ ik\left( x - \frac{\omega(k)}{k}t \right) \right]. \]

Now, replacing a function $f(x)$ by $f(x - a)$ has the effect of shifting $f$ to the right by $a$. Thus, the time-evolution has the effect of shifting the initial function to the right by an amount equal to $(\omega(k)/k)t$. This means that the function $\psi(x, t)$ is moving to the right with speed $\omega(k)/k$. This speed, for reasons that will be clearer in Sect. 4.3, is called the phase velocity.

The phase velocity, then, is the speed at which a pure exponential solution of our equation (the free Schrödinger equation) propagates. We compute the phase velocity as $\omega(k)/k = \hbar k/(2m)$. Now, we have said that a wave function of the form $e^{ikx}$ represents a particle with momentum $p = \hbar k$. We thus arrive at the following curious conclusion.

Proposition 4.1 The phase velocity of a particle with momentum $p = \hbar k$ is

\[ \text{phase velocity} = \frac{\omega(k)}{k} = \frac{\hbar k}{2m} = \frac{p}{2m}. \]

This velocity is half the velocity of a classical particle of momentum $p$.

Proposition 4.1 might make us think that our basic relation $p = \hbar k$ is off by a factor of 2. We will see, however, that the phase velocity, that is, the velocity of a pure exponential solution, is not the “real” velocity of a particle with momentum $p$. The real velocity is the “group velocity,” which will turn out to be, as expected, $p/m$.

Leaving aside for now the question of the velocity, let us build up a general solution to (4.1) from solutions of the form (4.2). We make use of the Fourier transform, discussed in Appendix A.3. We can then express the solution to the free Schrödinger equation, for “nice” initial conditions, as a “superposition” of these pure exponential solutions.

Proposition 4.2 Suppose that $\psi_0$ is a “nice” function, for example, a Schwartz function (Definition A.15). Let $\hat{\psi}_0$ denote the Fourier transform of $\psi_0$ and define $\psi(x, t)$ by

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{\psi}_0(k)\, e^{i(kx - \omega(k)t)} \, dk, \tag{4.5} \]

where $\omega(k)$ is defined by (4.3). Then $\psi(x, t)$ solves the free Schrödinger equation with initial condition $\psi_0$.

The assumption that $\psi_0$ be a Schwartz function is stronger than necessary. The reader is invited to trace through the argument and find suitable weaker conditions.

Proof. Since the Fourier transform of a Schwartz function is a Schwartz function, $\hat{\psi}_0(k)$ will decay faster than $1/k^4$ as $k$ tends to $\pm\infty$. Meanwhile, by integrating the derivative of the function $e^{ikx}$, we obtain the estimate

\[ \left| \frac{e^{ik(x+h)} - e^{ikx}}{h} \right| \le |k|. \]

We can then apply dominated convergence, using $|k|\,|\hat{\psi}_0(k)|$ as our dominating function, to move a derivative with respect to $x$ under the integral sign in the formula for $\psi(x, t)$. This derivative pulls down a factor of $ik$ inside the integral. The decay of $\hat{\psi}_0$ allows us to repeat this argument to move a second derivative with respect to $x$ inside the integral. We can also move a derivative with respect to $t$ inside the integral, by a similar argument.

Since $\exp\{i(kx - \omega(k)t)\}$ satisfies the Schrödinger equation for each fixed $k$, differentiation under the integral shows that $\psi(x, t)$ satisfies the Schrödinger equation as well. The Fourier inversion formula shows that $\psi(x, 0) = \psi_0(x)$. ■

Proposition 4.3 If $\psi(x, t)$ is as in Proposition 4.2, then the Fourier transform of $\psi(x, t)$, with respect to $x$ with $t$ fixed, is given by

\[ \hat{\psi}(k, t) = \hat{\psi}_0(k) \exp\left[ -i\frac{\hbar k^2 t}{2m} \right]. \tag{4.6} \]


Proof. We can write (4.5) as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left[ \hat{\psi}_0(k)\, e^{-i\omega(k)t} \right] e^{ikx} \, dk. \]

By the uniqueness of the Fourier decomposition (i.e., the injectivity of the inverse Fourier transform, which follows from the Plancherel formula), the Fourier transform of $\psi(x, t)$ (with respect to $x$) must be the function in square brackets. Putting in the expression (4.3) for $\omega(k)$ establishes the desired result. ■

Now, the Fourier transform is a unitary map from $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. Thus, for any $\psi_0$ in $L^2(\mathbb{R})$, $\hat{\psi}_0$ also belongs to $L^2(\mathbb{R})$. Since the quantity multiplying $\hat{\psi}_0(k)$ in (4.6) has absolute value 1, the right-hand side of (4.6) is a well-defined square-integrable function of $k$, for any $\psi_0$ in $L^2(\mathbb{R})$, which has a well-defined inverse Fourier transform in $L^2(\mathbb{R})$.

Definition 4.4 For any $\psi_0 \in L^2(\mathbb{R})$, define, for each $t \in \mathbb{R}$, $\psi(x, t)$ to be the unique element of $L^2(\mathbb{R})$ that has a Fourier transform (with respect to $x$) given by (4.6).

Definition 4.4 defines a time-evolution for arbitrary initial conditions in $L^2(\mathbb{R})$. For general $\psi_0 \in L^2(\mathbb{R})$, however, $\psi(x, t)$ may not satisfy the Schrödinger equation in the classical, pointwise sense, simply because $\psi(x, t)$ may fail to be differentiable, either in $x$ or in $t$. Nevertheless, $\psi(x, t)$, as defined by Definition 4.4, always satisfies the Schrödinger equation in the weak (distributional) sense. See Exercise 1.
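The recipe in Definition 4.4 (transform, multiply by a phase of absolute value 1, transform back) has a simple discrete analogue. The sketch below is only an illustration under stated assumptions: it replaces $L^2(\mathbb{R})$ by samples on a periodic grid of length $2\pi$, uses a naive unitary discrete Fourier transform, and sets $\hbar = m = 1$ by default. The function names are hypothetical. It shows that the multiplier in (4.6) leaves the norm unchanged, which is the discrete counterpart of the unitarity argument above.

```python
import cmath
import math


def dft(values):
    """Unitary discrete Fourier transform (naive O(n^2) sum)."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * math.pi * j * q / n) for j in range(n))
        / math.sqrt(n)
        for q in range(n)
    ]


def inverse_dft(values):
    """Inverse of the unitary discrete Fourier transform above."""
    n = len(values)
    return [
        sum(values[q] * cmath.exp(2j * math.pi * j * q / n) for q in range(n))
        / math.sqrt(n)
        for j in range(n)
    ]


def evolve(psi0, t, hbar=1.0, m=1.0, length=2.0 * math.pi):
    """Discrete analogue of (4.6): multiply each Fourier mode by exp(-i hbar k^2 t / 2m)."""
    n = len(psi0)
    psi_hat = dft(psi0)
    evolved_hat = []
    for q in range(n):
        signed_q = q if q < n // 2 else q - n  # map index to a signed frequency
        k = 2.0 * math.pi * signed_q / length
        evolved_hat.append(psi_hat[q] * cmath.exp(-1j * hbar * k * k * t / (2.0 * m)))
    return inverse_dft(evolved_hat)


def norm_squared(values):
    """Sum of squared moduli (discrete stand-in for the squared L^2 norm)."""
    return sum(abs(v) ** 2 for v in values)


# Assumed sample data: a Gaussian bump times a complex exponential.
n_points = 32
grid = [2.0 * math.pi * j / n_points for j in range(n_points)]
psi0 = [math.exp(-4.0 * (x - math.pi) ** 2) * cmath.exp(3j * x) for x in grid]
psi_t = evolve(psi0, t=0.5)
```

No differentiability of the samples is used anywhere, mirroring the fact that Definition 4.4 makes sense for every initial condition in $L^2(\mathbb{R})$.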


4.2  Solution  as  a  Convolution 


According to Proposition 4.3, we see that the Fourier transform of the time-$t$ wave function is the product of the Fourier transform of $\psi_0$ and the function $\exp[-it\hbar k^2/(2m)]$. According to Proposition A.21, the inverse Fourier transform of a product of two sufficiently nice functions is $1/\sqrt{2\pi}$ times the convolution of the two separate inverse Fourier transforms. Here the convolution $\phi * \psi$ of two functions $\phi$ and $\psi$ is defined to be

\[ (\phi * \psi)(x) = \int_{-\infty}^{\infty} \phi(x - y)\psi(y) \, dy, \]

whenever the integral is convergent for all $x$.

Formally, then, we ought to have

\[ \psi(x, t) = \psi_0 * K_t, \]

where

\[ K_t = \frac{1}{\sqrt{2\pi}} \mathcal{F}^{-1}\left[ \exp\left( -i\frac{\hbar k^2 t}{2m} \right) \right]. \]

The problem with this idea is that the function $\exp[-it\hbar k^2/(2m)]$ is not a “nice” function in the usual sense. Certainly, this function is not the Fourier transform of some function in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, because if it were, then the function would have to tend to zero at infinity (Proposition A.14). Therefore, we cannot directly apply Proposition A.21, even if $\psi_0$ is in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

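Fortunately, the desired inverse Fourier transform can be computed as a convergent improper integral (Exercise 2), with the following result:

\[ K_t(x) := \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ikx} \exp\left[ -i\frac{\hbar k^2 t}{2m} \right] dk = \sqrt{\frac{m}{2\pi i\hbar t}} \exp\left[ i\frac{m x^2}{2t\hbar} \right]. \tag{4.8} \]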


Here, the square root is the one with positive real part. The function $K_t$ is called the fundamental solution of the free Schrödinger equation. (See Fig. 4.1.) This function does indeed satisfy the free Schrödinger equation, as we can easily verify by direct differentiation.
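The direct differentiation can be mirrored by a finite-difference check. This is a sketch under stated assumptions, not the verification intended by the text: it takes $\hbar = m = 1$, uses the closed form $K_t(x) = \sqrt{m/(2\pi i\hbar t)}\,\exp[i m x^2/(2t\hbar)]$, and the names `fundamental_solution` and `kernel_residual` are illustrative. Python's `cmath.sqrt` returns the principal square root, which has nonnegative real part, matching the branch specified above.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1
M = 1.0     # assumption: unit mass


def fundamental_solution(x, t):
    """Closed form of K_t(x); cmath.sqrt picks the root with nonnegative real part."""
    prefactor = cmath.sqrt(M / (2j * math.pi * HBAR * t))
    return prefactor * cmath.exp(1j * M * x * x / (2.0 * t * HBAR))


def kernel_residual(x, t, h=1e-4):
    """Finite-difference estimate of dK/dt - (i hbar / 2m) d^2K/dx^2."""
    dk_dt = (fundamental_solution(x, t + h) - fundamental_solution(x, t - h)) / (2.0 * h)
    d2k_dx2 = (
        fundamental_solution(x + h, t)
        - 2.0 * fundamental_solution(x, t)
        + fundamental_solution(x - h, t)
    ) / (h * h)
    return dk_dt - (1j * HBAR / (2.0 * M)) * d2k_dx2


residual = abs(kernel_residual(x=0.5, t=1.0))

# |K_t(x)| does not depend on x: it equals sqrt(m / (2 pi hbar |t|)).
modulus_near = abs(fundamental_solution(0.5, 1.0))
modulus_far = abs(fundamental_solution(3.0, 1.0))
```

The two moduli agree, so all of the $x$-dependence of $K_t$ sits in its phase.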

The  preceding  discussion  should  make  the  following  result  plausible. 

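Theorem 4.5 Suppose $\psi_0 \in L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Then $\psi(x, t)$, as defined by (4.5), may be computed for all $t \ne 0$ as

\[ \psi(x, t) = \sqrt{\frac{m}{2\pi i\hbar t}} \int_{-\infty}^{\infty} \exp\left\{ i\frac{m}{2t\hbar}(x - y)^2 \right\} \psi_0(y) \, dy. \]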


The expression for $\psi(x, t)$ is $K_t * \psi_0$, where $K_t$ is as in (4.8).

Proof. For any set $E \subset \mathbb{R}$, let $1_E$ denote the indicator function of $E$, that is, the function that is 1 on $E$ and 0 elsewhere. Then $K_t 1_{[-n,n]}$ belongs to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ for any positive integer $n$. By Proposition A.21, then, we have

\[ \mathcal{F}\big( (K_t 1_{[-n,n]}) * \psi_0 \big) = \sqrt{2\pi}\, \mathcal{F}(K_t 1_{[-n,n]})\, \hat{\psi}_0. \tag{4.9} \]

Because $\psi_0$ is in $L^1(\mathbb{R})$, it is easy to see that $K_t 1_{[-n,n]} * \psi_0$ converges pointwise to $K_t * \psi_0$. On the other hand, using the argument in Exercise 2, we can see that $\mathcal{F}(K_t 1_{[-n,n]})$ is bounded by a constant independent of $n$ and converges pointwise to the function

\[ \frac{1}{\sqrt{2\pi}} \exp\left[ -i\frac{\hbar k^2 t}{2m} \right]. \tag{4.10} \]

Equation (4.10) is enough to show that the right-hand side of (4.9) converges in $L^2(\mathbb{R})$ to the function

\[ \exp\left[ -i\frac{\hbar k^2 t}{2m} \right] \hat{\psi}_0(k). \]

By the Plancherel theorem, $K_t 1_{[-n,n]} * \psi_0$ must also be converging in $L^2(\mathbb{R})$, and the $L^2$ limit must coincide with the pointwise limit, which is $K_t * \psi_0$. Thus, taking limits on both sides of (4.9) shows that the Fourier transform of $K_t * \psi_0$ is what we want it to be. ■

FIGURE 4.1. The real part of $K_t(x)$, for $t = 1$ (top) and $t = 0.2$ (bottom).

In general, to be considered the fundamental solution of a certain equation, a function should converge to a Dirac $\delta$-function (Example A.26), in the distribution sense, as $t$ tends to zero. Since $|K_t(x)|$ is independent of $x$ for each $t$, it might seem doubtful that $K_t$ has this property. On the other hand, we can see $K_t(x)$ oscillates very rapidly except near $x = 0$. (See Fig. 4.1.) This oscillation causes the integral of $K_t(x)$ against some nice function $\psi(x)$ to be small, except for the part of the integral near $x = 0$. Indeed, because the Fourier transform of $K_t$ converges to the constant function $1/\sqrt{2\pi}$ (which is what we get by formally taking the Fourier transform of the $\delta$-function) as $t$ tends to zero, it is not hard to show that $K_t$ does, in fact, converge to a $\delta$-function. The details of this verification are left to the reader.


4.3  Propagation  of  the  Wave  Packet:  First 
Approach 

Let us consider the Schrödinger equation in $\mathbb{R}^1$ with an initial condition $\psi_0$ that is a “wave packet,” meaning a complex exponential multiplied by some function that localizes $\psi_0$ in space. Specifically, we take

\[ \psi_0(x) = e^{ip_0 x/\hbar} A_0(x), \tag{4.11} \]

where $A_0$ is some real, positive function and $p_0$ is a nonzero real number. (The case $p_0 = 0$ should be treated separately.) We also assume that $A_0$ is “slowly varying” compared to $e^{ip_0 x/\hbar}$, meaning that $A_0$ is approximately constant over many periods of the function $e^{ip_0 x/\hbar}$. (We will give a more precise meaning to the “slowly varying” condition shortly.) Thus, if we look at $\psi_0(x)$ on a distance scale of a small number of periods of the function $e^{ip_0 x/\hbar}$, then $\psi_0$ will look like a constant times $e^{ip_0 x/\hbar}$, which, as we have seen, represents a particle with momentum $p_0$. We expect, then, that the wave function $\psi_0$ represents a particle with momentum approximately equal to $p_0$.

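Let us now try to solve the free Schrödinger equation in terms of the amplitude and phase of the wave function. We write

\[ \psi(x, t) = A(x, t)\, e^{i\theta(x, t)}, \]

where $A$ and $\theta$ are real-valued functions. If we plug this expression for $\psi$ into the free Schrödinger equation and then cancel a factor of $e^{i\theta(x, t)}$ from every term, we obtain the equation

\[ \frac{\partial A}{\partial t} + iA\frac{\partial \theta}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2 A}{\partial x^2} - \frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial \theta}{\partial x} - \frac{\hbar}{2m}A\frac{\partial^2 \theta}{\partial x^2} - \frac{i\hbar}{2m}A\left( \frac{\partial \theta}{\partial x} \right)^2. \tag{4.12} \]

Since $A$ and $\theta$ are real-valued, we may separately equate the real and imaginary parts of (4.12), giving

\[ \frac{\partial A}{\partial t} = -\frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial \theta}{\partial x} - \frac{\hbar}{2m}A\frac{\partial^2 \theta}{\partial x^2} \tag{4.13} \]

and (after dividing the imaginary part of (4.12) by $A$)

\[ \frac{\partial \theta}{\partial t} = \frac{\hbar}{2m}\frac{1}{A}\frac{\partial^2 A}{\partial x^2} - \frac{\hbar}{2m}\left( \frac{\partial \theta}{\partial x} \right)^2. \tag{4.14} \]

Any solution to this system of partial differential equations will yield a solution $\psi(x, t) = A(x, t)\, e^{i\theta(x, t)}$ to the free Schrödinger equation.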

Since we are assuming $A$ is “slowly varying” compared to $\theta$, it is reasonable to think that the first term on the right-hand side of (4.14) will be small compared to the second term. That is to say, we interpret the slowly varying condition to mean

\[ \frac{1}{A}\frac{\partial^2 A}{\partial x^2} \ll \left( \frac{\partial \theta}{\partial x} \right)^2, \tag{4.15} \]

where the symbol $\ll$ means “much smaller than.” We will take initial conditions such that (4.15) holds at $t = 0$, and then we will assume that (4.15) continues to hold at least for small positive times. We may then (to first approximation) drop the first term on the right-hand side of (4.14), giving the following simplified version of (4.14):

\[ \frac{\partial \theta}{\partial t} = -\frac{\hbar}{2m}\left( \frac{\partial \theta}{\partial x} \right)^2. \tag{4.16} \]

We now look for a solution to the pair of equations (4.13) and (4.16) with initial conditions corresponding to (4.11).

Proposition 4.6 A solution to the approximate equations (4.13) and (4.16) with initial condition $\theta(x, 0) = p_0 x/\hbar$ is given by

\[ \theta(x, t) = \frac{p_0}{\hbar}\left( x - \frac{p_0}{2m}t \right) \tag{4.17} \]

and

\[ A(x, t) = A_0\left( x - \frac{p_0}{m}t \right). \tag{4.18} \]

This yields an approximate solution to the free Schrödinger equation given by

\[ \psi(x, t) = A_0\left( x - \frac{p_0}{m}t \right) \exp\left\{ i\frac{p_0}{\hbar}\left( x - \frac{p_0}{2m}t \right) \right\}. \tag{4.19} \]

Note from (4.17) and (4.18) that if the “slowly varying” condition (4.15) holds at time 0, it will continue to hold for all positive times in our approximate solution.

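Proof. Although (4.16) is a nonlinear equation, we can find a solution to it with the simple initial conditions $\theta(x, 0) = p_0 x/\hbar$, namely,

\[ \theta(x, t) = \frac{p_0 x}{\hbar} - \frac{p_0^2 t}{2m\hbar} = \frac{p_0}{\hbar}\left( x - \frac{p_0}{2m}t \right). \tag{4.20} \]

Since $\partial\theta/\partial x = p_0/\hbar$ and $\partial^2\theta/\partial x^2 = 0$, if we plug (4.20) back into (4.13) we obtain

\[ \frac{\partial A}{\partial t} = -\frac{p_0}{m}\frac{\partial A}{\partial x}. \]

The (presumably unique) solution to this linear equation with initial condition $A(x, 0) = A_0(x)$ is

\[ A(x, t) = A_0\left( x - \frac{p_0}{m}t \right), \tag{4.21} \]

as claimed. ■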

We hope that the solution (4.19) to the system of equations (4.13) and (4.16) is a close approximation to the solution to the original pair of equations (4.13) and (4.14), assuming, of course, that $A_0$ is slowly varying compared to $\theta_0(x) = p_0 x/\hbar$. It is not especially easy to estimate directly how rapidly solutions to (4.13) and (4.16) diverge from solutions to (4.13) and (4.14). We will therefore leave an estimate of the error in our approximation until the next section, where we will obtain the same approximate solution by a different method.

Note that a function of the form $f(x, t) = \phi(x - vt)$ is moving to the right with constant velocity $v$. (If $v$ is negative, then, of course, this means the function is moving to the left.) Observe that both the amplitude $A(x, t)$ and the phase $\exp\{i\theta(x, t)\}$ are of this form, but with two different velocities.

Conclusion 4.7 In the approximate solution (4.19) to the free Schrödinger equation, the amplitude $A(x, t)$ is moving with velocity $p_0/m$, whereas the phase $\theta(x, t)$ is moving with velocity $p_0/(2m)$. These two velocities are called the group velocity and the phase velocity, respectively:

\[ \text{phase velocity} = \frac{p_0}{2m}, \qquad \text{group velocity} = \frac{p_0}{m}. \]

Note that the formula for the phase velocity agrees with the one given previously in Sect. 4.1, the velocity of propagation of a pure exponential solution to the free Schrödinger equation. Indeed, nothing prevents us from taking $A_0 \equiv 1$, in which case the left-hand side of (4.15) is actually identically zero, so that a solution to (4.13) and (4.16) is actually a solution to (4.13) and (4.14).

Which of the velocities is the “real” velocity of the particle? The answer is: the group velocity. After all, the probability distribution for the particle’s position is determined by the amplitude of the wave function and is unaffected by the phase. It is the amplitude that determines (as much as it can be determined) where the particle is. Thus, the true velocity of the particle should be the velocity at which the amplitude propagates. Figure 4.2 shows the propagation of the real part of a wave packet, with the motion of a single peak indicated by the shaded region. The phase velocity determines the speed at which the individual peaks in the real part of $\psi$ move, whereas the group velocity determines the speed of the packet as a whole. Since the peak we are tracking lags well behind the motion of the whole packet, we see that the phase velocity is smaller than the group velocity.

We should expect that solutions to our approximate equations (4.13) and (4.16) will diverge slowly over time from solutions to the free Schrödinger equation (4.13) and (4.14). For sufficiently long times, there may be a significant difference between approximate and true solutions. This expectation is confirmed in Sect. 4.5, where we investigate the spread of the wave packet, a phenomenon that is not seen in our approximation.


4.4  Propagation  of  the  Wave  Packet:  Second 
Approach 

We have seen that the general solution of the free Schrödinger equation can be obtained by means of the Fourier transform as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{\psi}_0(k) \exp[i(kx - \omega(k)t)] \, dk, \tag{4.22} \]

where

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.23} \]

Let us assume that $\psi_0$ has approximate momentum equal to $p_0$. Thus, we expect that $\hat{\psi}_0(k)$ will be concentrated near $k_0 := p_0/\hbar$. If that is the case, then only the values of $k$ close to $k_0$ are important. For $k$ close to $k_0$, we use the first-order Taylor expansion

\[ \omega(k) \approx \omega(k_0) + \omega'(k_0)(k - k_0), \tag{4.24} \]

where for now we do not put in the explicit formula for $\omega'(k_0)$.

FIGURE 4.2. Propagation of $\mathrm{Re}[\psi]$, with motion of a single peak shaded.

Inserting (4.24) into (4.22), we get two factors that are independent of $k$ and come outside the integral, leaving us with

\[ \psi(x, t) \approx \frac{1}{\sqrt{2\pi}}\, e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t} \int_{-\infty}^{\infty} \hat{\psi}_0(k) \exp[ik(x - \omega'(k_0)t)] \, dk = e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t}\, \psi_0(x - \omega'(k_0)t). \tag{4.25} \]

Note that the factors in front of $\psi_0(x - \omega'(k_0)t)$ are simply constants, that is, independent of $x$. These constants do not affect the “state” of the system, in that we have said that two vectors in the quantum Hilbert space that differ by a constant represent the same physical state. Ignoring these constants, we are left with the factor of $\psi_0(x - \omega'(k_0)t)$, which is simply shifting to the right at speed $\omega'(k_0)$. Thus, the (approximate) velocity at which our wave packet is moving is

\[ \text{velocity} \approx \omega'(k_0) = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]

Let us consider the special case in which $\psi_0$ is of the form

\[ \psi_0(x) = e^{ik_0 x} A_0(x), \]

where $A_0$ is real and positive. Then (4.25) becomes

\[ \psi(x, t) \approx e^{i\omega'(k_0)k_0 t}\, e^{-i\omega(k_0)t}\, e^{ik_0(x - \omega'(k_0)t)} A_0(x - \omega'(k_0)t). \]

After canceling the terms involving $\omega'(k_0)k_0 t$ in the exponent, we obtain

\[ \psi(x, t) \approx e^{ik_0 x}\, e^{-i\omega(k_0)t} A_0(x - \omega'(k_0)t). \]

Recalling that $p_0 = \hbar k_0$ and putting in the formula for $\omega$, we see that this approximation to $\psi(x, t)$ is precisely the same as the one we obtained, by a different method, in Proposition 4.6.

As in Sect. 4.3, we see that the velocity at which a pure exponential solution of the free Schrödinger equation propagates [namely, $\omega(k_0)/k_0 = \hbar k_0/(2m)$] is not the same as the velocity at which the overall wave packet propagates. Rather, as seen in (4.25), the wave packet propagates at a velocity given by $\omega'(k_0) = \hbar k_0/m$. We may summarize this conclusion in the following proposition.


Proposition 4.8 The speed at which a pure exponential solution of the free Schrödinger equation propagates is

\[ \text{phase velocity} = \frac{\omega(k_0)}{k_0} = \frac{\hbar k_0}{2m} = \frac{p_0}{2m}. \]

By contrast, the (approximate) speed at which the wave packet propagates is

\[ \text{group velocity} = \left. \frac{d\omega}{dk} \right|_{k = k_0} = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]

The disadvantage of the method we used in Sect. 4.3 is that it does not easily yield estimates on how big an error there is in our approximation. In the current section, however, we can estimate the error by comparing the Fourier transforms of the exact solution and the approximate solution. Our error estimate will involve a quantity $\kappa$, defined as follows:

\[ \kappa = \left[ \int_{-\infty}^{\infty} |\hat{\psi}_0(k)|^2 (k - k_0)^4 \, dk \right]^{1/4}. \tag{4.26} \]

The quantity $\kappa$ is, roughly, half the width of the interval around $k_0$ on which most of $\hat{\psi}_0(k)$ is concentrated. If, for example, $\hat{\psi}_0$ is supported in the interval $[k_0 - \varepsilon, k_0 + \varepsilon]$, then $\kappa \le \varepsilon$, assuming that $\psi_0$ (and therefore $\hat{\psi}_0$) is a unit vector. (A more common measure of concentration would replace $(k - k_0)^4$ by $(k - k_0)^2$ and the fourth root of the integral by the square root. But the “quartic” measure of concentration in (4.26) is the one that arises in estimating the error of our approximations in this section.)

Proposition 4.9 Let $\psi(x, t)$ be the exact solution to the free Schrödinger equation with initial condition $\psi_0$, and let $\phi(x, t)$ be the approximate solution given by the right-hand side of (4.25). Then the following $L^2$ estimate holds:

\[ \| \psi(x, t) - \phi(x, t) \|_{L^2(\mathbb{R})} \le |t|\frac{\hbar\kappa^2}{2m} = |t|\,\omega(\kappa), \tag{4.27} \]

where the $L^2$ norm is with respect to $x$ with $t$ fixed and where $\omega(\cdot)$ is defined by (4.23).

Equation (4.27) means that the $L^2$ norm of the error will be small, provided that

\[ |t| \ll \frac{1}{\omega(\kappa)}. \]

If $\kappa$ is much smaller than $k_0$, then $1/\omega(\kappa)$ will be much larger than $1/\omega(k_0)$. That means that the timescale on which the true and approximate solutions diverge will be long compared to the timescale on which our approximate solution is oscillating.
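To get a feeling for the sizes involved, one can evaluate $\kappa$ from (4.26) for a concrete momentum distribution. The sketch below assumes, purely for illustration, that $|\hat{\psi}_0(k)|^2$ is a normalized Gaussian centered at $k_0 = 10$ with standard deviation $\sigma = 0.5$, in units with $\hbar = m = 1$. The integral is approximated by the midpoint rule, and the function names are hypothetical. For a Gaussian the fourth central moment is $3\sigma^4$, so $\kappa = 3^{1/4}\sigma$.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1
M = 1.0     # assumption: unit mass


def omega(k):
    """Dispersion relation (4.23)."""
    return HBAR * k * k / (2.0 * M)


def kappa(density, k0, k_min, k_max, n=20000):
    """Quartic width (4.26) by the midpoint rule; density(k) stands for |psi0_hat(k)|^2."""
    dk = (k_max - k_min) / n
    total = 0.0
    for j in range(n):
        k = k_min + (j + 0.5) * dk
        total += density(k) * (k - k0) ** 4 * dk
    return total ** 0.25


K0 = 10.0    # assumed center of the momentum-space packet
SIGMA = 0.5  # assumed standard deviation of |psi0_hat|^2


def gaussian_density(k):
    """Assumed |psi0_hat(k)|^2: a normalized Gaussian, so psi0 is a unit vector."""
    return math.exp(-((k - K0) ** 2) / (2.0 * SIGMA ** 2)) / (SIGMA * math.sqrt(2.0 * math.pi))


kap = kappa(gaussian_density, K0, K0 - 10.0 * SIGMA, K0 + 10.0 * SIGMA)


def error_bound(t):
    """Right-hand side of (4.27): |t| * omega(kappa)."""
    return abs(t) * omega(kap)


divergence_timescale = 1.0 / omega(kap)   # time on which the bound reaches 1
oscillation_timescale = 1.0 / omega(K0)   # time on which the phase oscillates
```

With these assumed numbers the divergence timescale is a few hundred times the oscillation timescale, which is the separation of scales described above.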

Proof. Let $\hat{\psi}(k, t)$ and $\hat{\phi}(k, t)$ denote the Fourier transforms of $\psi$ and $\phi$ with respect to $x$, with $t$ fixed. From (4.22) we can read off that

\[ \hat{\psi}(k, t) = e^{-i\omega(k)t}\, \hat{\psi}_0(k). \]

Meanwhile, $\hat{\phi}(k, t)$ is obtained from $\hat{\psi}(k, t)$ by replacing $\omega(k)$ by the right-hand side of (4.24). Now, direct calculation shows that

\[ \omega(k) - \big( \omega(k_0) + \omega'(k_0)(k - k_0) \big) = \frac{\hbar}{2m}(k - k_0)^2. \]

From this expression and the elementary estimate

\[ \left| e^{i\theta} - e^{i\phi} \right| \le |\theta - \phi|, \]

we obtain

\[ \left| \hat{\psi}(k, t) - \hat{\phi}(k, t) \right| \le \frac{\hbar |t|}{2m}(k - k_0)^2 \left| \hat{\psi}_0(k) \right|. \tag{4.28} \]

The estimate (4.27) then follows by the Plancherel theorem and the definition of $\kappa$. ■

For  a  more  detailed  version  of  the  approach  used  in  this  section,  see 
Sect.  5.6  of  [30]. 


4.5  Spread  of  the  Wave  Packet 

We use the uncertainty (Definition 3.13) $\Delta_\psi X$ in the position of the particle as a measure of the “width” of $\psi(x)$ as a function of $x$. At the level of approximation considered in the previous two sections, the uncertainty in the position of a free particle is independent of time. After all, in the approximate solution (4.19), the amplitude of the wave function simply shifts to the right at a speed equal to the group velocity, without changing shape. A more precise calculation, however, shows that after sufficiently long times, the wave packet spreads out in space. (Exercise 7 gives an idea of the time scale on which this spread takes place.)

We can compute the time-evolution of the uncertainty in the particle’s position without having to solve the full Schrödinger equation, by using Proposition 3.14 from Chap. 3. We start by observing that for a free particle, our Hamiltonian is simply $P^2/(2m)$, which commutes with $P$. It follows that the expected value and uncertainty for the particle’s momentum (and, indeed, the entire probability distribution of the momentum) are independent of time. Meanwhile, to compute the time-dependence of $\langle X \rangle$ and $\langle X^2 \rangle$, we use Proposition 3.14 along with the commutation relation $[X, P] = i\hbar I$ (Proposition 3.8).

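Proposition 4.10 For a wave function $\psi(x,t)$ evolving according to the
free Schrödinger equation on $\mathbb{R}^1$, the expectation values for $X$ and $X^2$ evolve
as follows:

$$\langle X \rangle_{\psi(t)} = \langle X \rangle_{\psi_0} + \frac{t}{m}\langle P \rangle_{\psi_0}$$

and

$$\langle X^2 \rangle_{\psi(t)} = \langle X^2 \rangle_{\psi_0} + \frac{t}{m}\langle XP + PX \rangle_{\psi_0} + \frac{t^2}{m^2}\langle P^2 \rangle_{\psi_0}.$$

These relations imply the following result:

$$(\Delta_{\psi(t)} X)^2 = \frac{t^2}{m^2}(\Delta_{\psi_0} P)^2 + \frac{t}{m}\left(\langle XP + PX \rangle_{\psi_0} - 2\langle X \rangle_{\psi_0}\langle P \rangle_{\psi_0}\right) + (\Delta_{\psi_0} X)^2.$$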


For a unit vector $\psi_0$ in $L^2(\mathbb{R})$, the uncertainty $\Delta_{\psi_0} P$ in the momentum
cannot be zero, because the uncertainty would be zero only if $\psi_0$ is an
eigenvector for the momentum operator. But the eigenvectors for $P$ are
the functions of the form $e^{ikx}$, which are not in $L^2(\mathbb{R})$. Thus, the leading
coefficient in the expression for $(\Delta_{\psi(t)} X)^2$ is never zero, and thus $\Delta_{\psi(t)} X$
tends to infinity as $t$ tends to infinity.
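
To get a feel for the quadratic growth of $(\Delta_{\psi(t)} X)^2$ in Proposition 4.10, one can evaluate the formula directly. The sketch below assumes units with $\hbar = 1$ and, as an illustrative initial condition not taken from the text, a minimum-uncertainty Gaussian packet with position variance $\sigma^2$, momentum variance $\hbar^2/(4\sigma^2)$, and vanishing cross term $\langle XP + PX \rangle - 2\langle X \rangle\langle P \rangle$. The helper names are hypothetical.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def position_variance(t, mass, var_x0, var_p0, cross0):
    """(Delta X)^2 at time t from Proposition 4.10.

    var_x0, var_p0: initial variances of X and P.
    cross0: <XP + PX> - 2 <X><P> in the initial state.
    """
    return (t * t / mass ** 2) * var_p0 + (t / mass) * cross0 + var_x0


def gaussian_packet_moments(sigma):
    """Initial moments for an assumed minimum-uncertainty Gaussian packet.

    Returns (var_x0, var_p0, cross0); the cross term vanishes for a real
    Gaussian amplitude multiplied by a plane-wave phase.
    """
    return sigma ** 2, HBAR ** 2 / (4.0 * sigma ** 2), 0.0
```

With these assumptions the variance doubles at $t = 2m\sigma^2/\hbar$, so a narrower initial packet spreads faster.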

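Proof. We compute that

$$[P^2, X] = P^2X - PXP + PXP - XP^2 = P[P,X] + [P,X]P = -2i\hbar P.$$

Thus (as we have already noted in Sect. 3.7.5),

$$\frac{d}{dt}\langle X \rangle_{\psi(t)} = \left\langle \frac{i}{\hbar}\left[\frac{P^2}{2m}, X\right] \right\rangle_{\psi(t)} = \frac{\langle P \rangle_{\psi(t)}}{m} = \frac{\langle P \rangle_{\psi_0}}{m}, \tag{4.29}$$

where we have used in the last equality that the expected momentum is
independent of time. Since the derivative of $\langle X \rangle_{\psi(t)}$ is constant, $\langle X \rangle_{\psi(t)}$
itself is a linear function of $t$, which gives the first result in the proposition.
Meanwhile, a little algebra shows that

$$[P^2, X^2] = P[P,X]X + [P,X]PX + XP[P,X] + X[P,X]P = -2i\hbar(PX + XP),$$

and

$$[P^2, PX + XP] = P[P^2, X] + [P^2, X]P = -4i\hbar P^2.$$

Thus

$$\frac{d}{dt}\langle X^2 \rangle_{\psi(t)} = \frac{i}{2m\hbar}\left\langle [P^2, X^2] \right\rangle_{\psi(t)} = \frac{1}{m}\langle XP + PX \rangle_{\psi(t)}$$

and

$$\frac{d^2}{dt^2}\langle X^2 \rangle_{\psi(t)} = \frac{i}{\hbar}\,\frac{1}{m}\,\frac{1}{2m}\left\langle [P^2, XP + PX] \right\rangle_{\psi(t)} = \frac{2}{m^2}\langle P^2 \rangle_{\psi(t)} = \frac{2}{m^2}\langle P^2 \rangle_{\psi_0}.$$

Since the second derivative of $\langle X^2 \rangle_{\psi(t)}$ is independent of $t$, $\langle X^2 \rangle_{\psi(t)}$ itself
is a quadratic polynomial in $t$, the coefficients of which are determined by
the value of $\langle X^2 \rangle_{\psi(t)}$ and its first two time-derivatives at $t = 0$. This leads
to the second result in the proposition. The last result follows by direct
calculation. ■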
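
The commutator identities used in the proof are purely algebraic, so they can be sanity-checked by letting $X$ and $P = -i\hbar\,d/dx$ act on polynomials. The following is a rough sketch under that assumption (polynomials are not in $L^2(\mathbb{R})$, so this checks only the algebra, not any domain questions); polynomials are stored as coefficient lists and all helper names are illustrative.

```python
HBAR = 1.0  # assumption: units with hbar = 1


def X(poly):
    """Position operator: multiply the polynomial [a0, a1, ...] by x."""
    return [0j] + [complex(a) for a in poly]


def P(poly):
    """Momentum operator: apply -i*hbar d/dx to the polynomial."""
    derivative = [n * complex(a) for n, a in enumerate(poly)][1:]
    return [-1j * HBAR * a for a in derivative] or [0j]


def compose(*ops):
    """Operator product: compose(P, X) acts as P(X(f))."""
    def product(poly):
        for op in reversed(ops):
            poly = op(poly)
        return poly
    return product


def add(p, q, sign=1):
    """Return p + sign*q for coefficient lists of possibly different length."""
    n = max(len(p), len(q))
    p = list(p) + [0j] * (n - len(p))
    q = list(q) + [0j] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]


def scale(c, p):
    """Multiply every coefficient by the scalar c."""
    return [c * a for a in p]


def commutator(a, b):
    """[a, b] = ab - ba as an operator on polynomials."""
    return lambda poly: add(a(b(poly)), b(a(poly)), sign=-1)


def is_zero(p, tol=1e-12):
    """True if every coefficient is numerically zero."""
    return all(abs(a) < tol for a in p)
```

For example, `commutator(compose(P, P), X)` applied to any coefficient list should agree with `scale(-2j * HBAR, P(f))`, which is the first identity in the proof.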


4.6  Exercises 


1. A locally integrable function $\psi(x,t)$ satisfies the free Schrödinger
equation in the weak (or distributional) sense if for each smooth compactly
supported function $\chi$, we have

$$\int_{\mathbb{R}^2} \psi(x,t)\left[\frac{\partial \chi}{\partial t} + \frac{i\hbar}{2m}\frac{\partial^2 \chi}{\partial x^2}\right] dx\,dt = 0. \tag{4.30}$$

[One obtains (4.30) by assuming $\partial\psi/\partial t - (i\hbar/2m)\,\partial^2\psi/\partial x^2$ is zero,
integrating against $\chi(x,t)$, and then formally integrating by parts.]

(a) Show that if $\psi(x,t)$ is smooth as a function of $x$ and $t$ then $\psi$
satisfies the free Schrödinger equation in the pointwise sense if
and only if $\psi$ satisfies the free Schrödinger equation in the weak
sense.

Hint: Proposition A.23 may be useful.

(b) For any $\psi_0 \in L^2(\mathbb{R})$, define $\psi(x,t)$ by Definition 4.4. Show that
$\psi$ satisfies the free Schrödinger equation in the weak sense.

First show that the function $\psi_A$ given by

$$\psi_A(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-A}^{A} \hat{\psi}_0(k)\,e^{i(kx - \omega(k)t)}\,dk$$

satisfies the free Schrödinger equation in the weak sense, for each $A$.

2. (a) Show that for any $a \in \mathbb{C}$ with $\mathrm{Re}(a) > 0$,

$$\left(\int_{\mathbb{R}} e^{-x^2/(2a)}\,dx\right)^2 = \int_{\mathbb{R}^2} e^{-(x^2+y^2)/(2a)}\,dx\,dy = 2\pi a,$$

where the integral over $\mathbb{R}^2$ can be evaluated using polar coordinates.
Conclude that

$$\int_{\mathbb{R}} e^{-x^2/(2a)}\,dx = \sqrt{2\pi a}, \tag{4.31}$$

where the square root is the one with positive real part.

(b) Show that for all $A, B > 0$ we have

$$\int_A^B e^{-x^2/(2a)}\,dx = \left[-\frac{a}{x}\,e^{-x^2/(2a)}\right]_A^B - \int_A^B \frac{a}{x^2}\,e^{-x^2/(2a)}\,dx$$

for any nonzero complex number $a$. Using this, show that the
integral in (4.31) is convergent for all nonzero $a$ with $\mathrm{Re}\,a \ge 0$,
provided the integral is interpreted as an improper integral (i.e.,
the limit as $A$ tends to infinity of an integral from $-A$ to $A$).


(c) Now show that the result of Part (a) is valid also for nonzero
values of $a$ with $\mathrm{Re}\,a = 0$.

Hint: Given $\beta \ne 0$, show that the (improper) integral from $A$
to $\infty$ of $\exp[-x^2/(2(\alpha + i\beta))]$ is small for large $A$, uniformly in
$\alpha \in [0,1]$.

(d) Show that

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,e^{-it\hbar k^2/(2m)}\,dk = \sqrt{\frac{m}{2\pi i\hbar t}}\;e^{imx^2/(2t\hbar)},$$

where the integral is interpreted as an improper integral and the
square root is the one with positive real part.

3. Suppose $\phi$ is a Schwartz function (Definition A.15) and $\psi$ belongs to
$L^2(\mathbb{R})$. Show that the convolution $\phi * \psi$ is smooth (infinitely
differentiable).

4. Consider the heat equation for a function $\psi(x,t)$, given by

$$\frac{\partial \psi}{\partial t} = \alpha\,\frac{\partial^2 \psi}{\partial x^2},$$

where $\alpha$ is a constant, subject to the initial condition $\psi(x, 0) = \psi_0(x)$.

(a) Derive a differential equation for $\hat{\psi}(k,t)$, the Fourier transform
of a solution of the heat equation with respect to $x$, with $t$ fixed,
assuming that $\psi(x,t)$ is a “nice” function of $x$ for each $t$. Solve
this equation subject to the initial condition $\hat{\psi}(k,0) = \hat{\psi}_0(k)$.

(b) Obtain an expression for the solution to the heat equation as
a convolution of $\psi_0$ with a “fundamental solution” to the heat
equation.


Note:  As  we  will  discuss  in  Chap.  20,  the  heat  equation  can  be  thought 
of  as  a  sort  of  “imaginary  time”  version  of  the  free  Schrodinger 
equation. 


5. Suppose we take an initial condition in the free Schrödinger equation
with initial phase given by $\theta_0(x) = p_0 x/\hbar$ and initial amplitude given
by $A_0(x)$, as in (4.11). Suppose also that the initial amplitude is of
the form


A0(x)  =  exp  {  -1 


Note that $A_0$ is centered around the point $x_0$ and that the parameter
$L$ is a measure of the “width” in space of our initial wave packet.
A function of the form $\psi_0(x) = e^{ip_0x/\hbar}A_0(x)$, with $A_0$ as above, is
called a Gaussian wave packet.

Compute the quantity

$$\frac{1}{\left(\frac{d\theta_0}{dx}\right)^2}\,\frac{1}{A_0}\,\frac{d^2A_0}{dx^2}. \tag{4.32}$$

Assuming that $\hbar$ is small compared to $Lp_0$, show that (4.32) is small,
except at points where our initial wave packet is very small.

Note: This shows that our “slowly varying” assumption (4.15) is reasonable
for the case of Gaussian wave packets.


6. The Klein-Gordon equation, a proposed relativistic alternative to the
Schrödinger equation, is the equation

$$\frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2} - \frac{\partial^2 \psi}{\partial x^2} = -\frac{m^2c^2}{\hbar^2}\,\psi,$$

where $m > 0$ is the mass of the particle and $c$ is the speed of light.

(a) Obtain the dispersion relation for the Klein-Gordon equation,
that is, the expression for $\omega(k)$ that makes the function
$\exp[i(kx - \omega(k)t)]$ a solution to the Klein-Gordon equation.

(b) Show that the phase velocity $\omega(k)/k$ satisfies $|\omega(k)/k| > c$, that
the group velocity $d\omega(k)/dk$ satisfies $|d\omega/dk| < c$, and that

(phase velocity)(group velocity) $= c^2$.

Note: Since the Klein-Gordon equation is second order in time, there
will be two possible values for $\omega(k)$ for each $k$, one positive and one
negative. The results of Part (b) hold for both of the two “branches”
of $\omega(k)$.


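7. Consider the uncertainty $\Delta_{\psi(t)} X$ of a wave function $\psi(t)$ evolving
according to the free Schrödinger equation. Show that

$$\left|\frac{d}{dt}\,\Delta_{\psi(t)} X\right| \le \frac{\Delta_{\psi_0} P}{m} \tag{4.33}$$

for all $t$ and that

$$\lim_{t \to +\infty} \frac{d}{dt}\left(\Delta_{\psi(t)} X\right) = \frac{\Delta_{\psi_0} P}{m}.$$

Note: By comparison,

$$\frac{d}{dt}\langle X \rangle_{\psi(t)} = \frac{\langle P \rangle_{\psi_0}}{m}. \tag{4.34}$$

If $\hat{\psi}_0(k)$ is concentrated in a sufficiently small region around a nonzero
number $k_0 = p_0/\hbar$, then $\Delta_{\psi_0} P$ will be small compared to $\langle P \rangle_{\psi_0}$. In
that case, by comparing (4.33) to (4.34), we see that the rate at which
the wave packet spreads out is small compared to the rate at which
the wave packet moves.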


5 

A  Particle  in  a  Square  Well 


5.1  The  Time-Independent  Schrodinger  Equation 

It is difficult to solve the time-dependent Schrödinger equation explicitly,
even in relatively simple cases. (Even for the free Schrödinger equation,
we made do in Chap. 4 with solutions that are either approximate or that
involve an integral that is not explicitly evaluated.) Usually, then, one
analyzes the time-independent Schrödinger equation (the eigenvector equation
for $\hat{H}$) and then attempts to infer something about the time-dependent
problem from the results. There are a number of problems, including the
harmonic oscillator and the hydrogen atom, in which the time-independent
Schrödinger equation can be solved explicitly.

In this section, we will consider a simple but instructive example, which
can be solved by elementary methods. We consider the time-independent
Schrödinger equation in $\mathbb{R}^1$, with a potential of the form

$$V(x) = \begin{cases} -C, & -A \le x \le A \\ 0, & |x| > A \end{cases} \tag{5.1}$$

where $A$ and $C$ are positive constants. The region $-A \le x \le A$ is the
“square well” for the potential (Fig. 5.1).

Let us think first for a moment about the behavior of a classical particle
in a square well. If we think of $V$ as the limit of a sequence of potentials
that change linearly from $-C$ to 0 in a small interval around $\pm A$, we may
expect the following behavior for a particle in a square well. If the energy
of the particle is negative, then the particle must be in the well. In that
case, it will move with constant speed until it hits the edge of the well,
at which point it will reflect instantaneously off the wall and move with
the same speed in the opposite direction. If the energy of the particle is
positive, it will move always in the same direction, with speed equal to one
constant when it is not in the well and speed equal to a different constant
when it is in the well.

FIGURE 5.1. A square well potential.

In  the  quantum  case,  we  will  be  interested  mainly  in  eigenvectors  for  the 
Schrodinger  operator  with  negative  eigenvalues  (E  <  0).  Of  course,  on  the 
quantum  side  of  things,  energy  eigenvectors  do  not  change  in  time,  except 
for  an  overall  phase  factor.  Nevertheless,  since  the  classical  particle  with 
E  <  0  spends  the  same  amount  of  time  in  each  part  of  the  well,  we  may 
expect  that  the  quantum  particle  will  have  approximately  equal  probability 
of  being  found  in  each  part  of  the  well.  This  expectation  will  be  fulfilled 
for  “highly  excited  states,”  such  as  the  one  in  Fig.  5.7.  For  the  quantum 
particle,  however,  there  is  a  small  but  nonzero  probability  of  finding  the 
particle  outside  the  well,  which  is  impossible  classically. 

Our goal is to study the time-independent Schrödinger equation, that is,
the eigenvalue equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x), \tag{5.2}$$

where both the eigenvalues $E$ and the associated eigenvectors $\psi$ (or
“eigenfunctions,” in physics terminology) are as yet unknown. As a second-order
linear ordinary differential equation, this equation always has (for any value
of $E$) a two-dimensional solution space. We are, however, looking for
solutions that lie in the quantum Hilbert space $L^2(\mathbb{R})$. We will see there are
actually only finitely many $E$’s, all of them with $E < 0$, for which (5.2)
has a nonzero solution in $L^2(\mathbb{R})$. In this case, then, the Schrödinger
operator $\hat{H}$ has a discrete spectrum below zero and a continuous spectrum
above zero.


5.2  Domain  Questions  and  the  Matching  Conditions 


Before starting to solve (5.2), we must give some heed to the unbounded
nature of the Hamiltonian operator. The Schrödinger operator

$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(X)$$

on the left-hand side of (5.2) is an unbounded operator, meaning that there
is no constant $C$ such that $\|\hat{H}\psi\| \le C\,\|\psi\|$ for every $\psi$ on which $\hat{H}$ is
defined, where $\|\cdot\|$ is the $L^2$ norm. On the other hand, we want to define
$\hat{H}$ in such a way that it is self-adjoint. But according to Corollary 9.9, a
self-adjoint operator that is defined on the whole Hilbert space must be bounded.

We conclude, then, that $\hat{H}$ is not going to be defined on the entire Hilbert
space $L^2(\mathbb{R})$, but only on a dense subspace thereof. In practical terms,
saying that $\hat{H}$ is not defined on the whole Hilbert space means simply that
for many functions $\psi$ in $L^2(\mathbb{R})$, the second derivative $d^2\psi/dx^2$ does not
exist, or exists but fails to be in $L^2$. (In our example, the potential $V$ is
bounded, and so $V\psi$ will always be in $L^2$ provided that $\psi$ is in $L^2$.)

Since the potential $V$ for a square well is bounded, the domain of the
Hamiltonian $\hat{H} = P^2/(2m) + V(X)$ is the same as the domain of the
kinetic energy operator $P^2/(2m) = -(\hbar^2/2m)\,d^2/dx^2$. As we will see in
Sect. 9.7, the domain of the kinetic energy operator may be described as
the space of $L^2$ functions $\psi$ for which $d^2\psi/dx^2$, computed in the weak
or distributional sense (Appendix A.3.3), again belongs to $L^2(\mathbb{R})$. This
condition is equivalent to the statement that there exists some $L^2$ function
$\phi$ such that $\psi$ is the second integral of $\phi$ (for some choice of the constants
of integration).

Meanwhile, since our potential is piecewise constant, any solution $\psi$
to (5.2) will be smooth except possibly at the transition points $x = \pm A$,
and both $\psi$ and $\psi'$ will have left and right limits at $A$ and $-A$. Indeed, on
each of the intervals $(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, any solution to (5.2)
will be simply a linear combination of (real or complex) exponentials. For
functions of this sort, it is not hard to see when we are in the domain of $\hat{H}$.


Proposition 5.1 Suppose $\psi$ is smooth on each of the intervals $(-\infty, -A)$,
$(-A, A)$, and $(A, \infty)$. Then $\psi$ belongs to the domain of $\hat{H}$ [with potential
function given by (5.1)] if and only if (1) $\psi$ and $d\psi/dx$ are continuous
at $x = \pm A$, and (2) $d^2\psi/dx^2$ belongs to $L^2(\mathbb{R})$.

Proof. Suppose first that $\psi$ satisfies the conditions (1) and (2). Then it is
not hard to see (Exercise 1) that the second derivative of $\psi$ in the distribution
sense is simply the function $d^2\psi/dx^2$, computed in the ordinary pointwise
sense for $x \ne \pm A$. (The second derivative may not exist at $x = \pm A$,
but we simply leave $d^2\psi/dx^2$ undefined at these two points, which form a
set of measure zero.) Thus, $d^2\psi/dx^2$, computed in the distribution sense,
is an element of $L^2(\mathbb{R})$.

On the other hand, if either $\psi$ or $\psi'$ has a discontinuity at $x = A$ or at
$x = -A$, then (Exercise 1 again) the distributional derivative will contain
either a multiple of a $\delta$-function or a multiple of the derivative of a $\delta$-function
at one of these points. But neither a $\delta$-function nor the derivative of a
$\delta$-function is a square-integrable function. ■

Let us think about what the continuity condition on $\psi$ and $d\psi/dx$ means
in practical terms. Since $V$ is constant on $(-\infty, -A)$, we can easily solve
(5.2) on that interval, obtaining a two-dimensional solution space. Once we
choose a solution from this solution space, then the values of $\psi$ and $d\psi/dx$
as $x$ approaches $-A$ from the left will serve as the initial conditions for
solving (5.2) on $(-A, A)$. Thus, the requirement of continuity for $\psi$ and $d\psi/dx$
serves as a “matching condition” between the solution on $(-\infty, -A)$ and the
solution on $(-A, A)$. We cannot just separately pick any solution to (5.2)
on $(-\infty, -A)$ and any solution on $(-A, A)$; at the boundary, the values of
$\psi$ and $d\psi/dx$ must match. (This same matching condition appears in
elementary treatments of ordinary differential equations with discontinuous
coefficients.)

Once we pick a solution on $(-\infty, -A)$ we get a unique solution on
$(-A, A)$, and then the values of $\psi$ and $d\psi/dx$ as we approach $A$ from
the left will serve as the initial conditions for solving (5.2) on $(A, \infty)$. The
conclusion is that once we pick a solution to (5.2) on $(-\infty, -A)$ (from the
two-dimensional solution space), we have no additional choices to make;
the differential equation along with the matching conditions gives a unique
way to extend the solution from $(-\infty, -A)$ to the whole real line.


5.3  Finding  Square-integrable  Solutions 

If $E > 0$, then any solution to (5.2) will be a combination of two complex
exponentials in the range $x < -A$; such a function cannot be square-integrable
unless it is identically zero. If, however, we take $\psi$ to be identically
zero in the region $x < -A$, then our continuity condition requires
that $\psi$ and $d\psi/dx$ approach 0 as $x$ approaches $-A$ from the right. Thus,
the matching conditions at $-A$ force the solution to be identically zero in
$[-A, A]$ as well. Finally, by matching across $x = A$, we get an identically
zero solution on $[A, \infty)$. Thus, for $E > 0$, any square-integrable solution
to (5.2) satisfying the continuity conditions in Proposition 5.1 must be
identically zero. A similar analysis applies when $E = 0$, where the solutions
to (5.2) on $(-\infty, -A]$ would be of the form $c_1 + c_2 x$, which is square-integrable
only if $c_1 = c_2 = 0$.
The conclusion, then, is that to have a chance to get a solution to (5.2)
that is square-integrable and in the domain of $\hat{H}$, we must take $E < 0$. For
$E < 0$, the solution to (5.2) on $(-\infty, -A)$ will be a linear combination of
the two exponentials $\exp(\alpha x)$ and $\exp(-\alpha x)$, where

$$\alpha = \frac{\sqrt{2m|E|}}{\hbar}. \tag{5.3}$$

For $\psi$ to be square-integrable over $(-\infty, -A)$, the coefficient of $\exp(-\alpha x)$
must be zero, since this term grows exponentially as $x$ tends to $-\infty$. Thus,
the value of $\psi$ on $(-\infty, -A)$ must be $c\exp(\alpha x)$. Once we choose a value
for $c$, we get a unique solution on $(-A, A)$ by matching $\psi$ and $\psi'$ across
$x = -A$. We then get a unique solution on $(A, \infty)$ by matching across
$x = A$. The solution on $(A, \infty)$ will again be a linear combination
of $\exp(\alpha x)$ and $\exp(-\alpha x)$. For $\psi$ to be in $L^2$, we need the coefficient of
$\exp(\alpha x)$ on $(A, \infty)$ to be zero. We have no choice, however, about what $\psi$
is on $(A, \infty)$; the coefficient of $\exp(\alpha x)$ either comes out to be zero or it
does not.

The conclusion, then, is that for any $E < 0$, there is a unique (up to a
constant) solution to (5.2) that is square-integrable on the interval $(-\infty, -A)$.
This solution then gives rise to a unique solution on $(-A, A)$ and then to a
unique solution on $(A, \infty)$, up to a constant. Unless we are lucky, the
solution on $(A, \infty)$ will grow exponentially and thus fail to be in $L^2$. Therefore,
in most cases there will be no nonzero solution to (5.2) that satisfies the
continuity condition and is square-integrable over the whole real line. The
hope is that for certain special values of $E$, we will be able to find a
solution that decays exponentially both on $(-\infty, -A)$ and on $(A, \infty)$, in which
case the solution will belong to $L^2(\mathbb{R})$.

It can be shown (Exercise 6) that there are no nonzero square-integrable
solutions with $E < -C$. Therefore, any square-integrable solutions to (5.2)
that may exist must come from the range $-C < E < 0$. To analyze this
range, let us rewrite the time-independent Schrödinger equation by dividing
through by $-\hbar^2/(2m)$, yielding the equation

$$\frac{d^2\psi}{dx^2} = \begin{cases} \varepsilon\psi, & |x| > A \\ -(c - \varepsilon)\psi, & |x| < A \end{cases} \tag{5.4}$$

where

$$\varepsilon = -\frac{2mE}{\hbar^2}; \qquad c = \frac{2mC}{\hbar^2}. \tag{5.5}$$

Note that although $E$ is assumed to be negative, we have normalized $\varepsilon$ to
be positive; the condition $-C < E < 0$ corresponds to $0 < \varepsilon < c$.
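
The matching procedure can be turned into a small "shooting" computation in the dimensionless variables $\varepsilon$ and $c$. The sketch below is an illustration under assumptions of our own choosing, not a method stated in the text: it normalizes the decaying solution on $(-\infty, -A)$ so that $\psi(-A) = 1$ and $\psi'(-A) = \sqrt{\varepsilon}$, propagates it through the well in closed form, and returns the coefficient of the growing exponential on $(A, \infty)$. Square-integrable solutions correspond to the zeros of that coefficient, which are located by a sign scan followed by bisection. The function names and the `samples` parameter are hypothetical.

```python
import math


def growing_coefficient(eps, c, A):
    """Coefficient of exp(+sqrt(eps) (x - A)) on (A, oo), for 0 < eps < c.

    The solution is normalized so that psi(-A) = 1 and psi'(-A) = sqrt(eps),
    i.e. it is the decaying exponential on (-oo, -A).
    """
    kappa = math.sqrt(eps)      # decay rate outside the well
    k = math.sqrt(c - eps)      # oscillation rate inside the well
    # Closed-form solution inside the well, evaluated at x = A.
    psi_A = math.cos(2 * k * A) + (kappa / k) * math.sin(2 * k * A)
    dpsi_A = -k * math.sin(2 * k * A) + kappa * math.cos(2 * k * A)
    # Matching to G exp(kappa (x - A)) + D exp(-kappa (x - A)) gives G.
    return 0.5 * (psi_A + dpsi_A / kappa)


def bound_state_levels(c, A, samples=2000):
    """Values of eps in (0, c) at which the growing coefficient changes sign."""
    levels = []
    grid = [c * (i + 0.5) / samples for i in range(samples)]
    for lo, hi in zip(grid, grid[1:]):
        g_lo = growing_coefficient(lo, c, A)
        g_hi = growing_coefficient(hi, c, A)
        if g_lo * g_hi < 0:
            for _ in range(60):  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                g_mid = growing_coefficient(mid, c, A)
                if g_lo * g_mid <= 0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            levels.append(0.5 * (lo + hi))
    return levels
```

A sign scan can miss two roots that fall between adjacent grid points, so `samples` should be increased for deep wells.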
Because our potential function $V$ is even, it is easy to see that for any
solution $\psi$ to (5.4), the even and odd parts of $\psi$ are also solutions. We can,
therefore, analyze even solutions and odd solutions separately. We begin
with the even case. For $x < -A$, every solution to (5.4) that is square-integrable
over $(-\infty, -A)$ is of the form

$$\psi(x) = a e^{\sqrt{\varepsilon}x}, \quad x < -A. \tag{5.6}$$

Since we assume that $\psi$ is even, we then have

$$\psi(x) = a e^{-\sqrt{\varepsilon}x}, \quad x > A. \tag{5.7}$$

Meanwhile, for $-A < x < A$, every even solution is of the form

$$\psi(x) = b\cos(\sqrt{c - \varepsilon}\,x). \tag{5.8}$$

Proposition 5.2 Let $\psi$ be the function defined in (5.6)-(5.8). Then there
exist nonzero constants $a$ and $b$ so that $\psi$ belongs to the domain of $\hat{H}$ if
and only if the following matching condition holds:

$$\sqrt{\varepsilon} = \sqrt{c - \varepsilon}\,\tan(\sqrt{c - \varepsilon}\,A). \tag{5.9}$$

Proof. Clearly both $\psi$ and $d^2\psi/dx^2$ belong to $L^2(\mathbb{R})$. Thus, in light of
Proposition 5.1, we need only ensure that $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$. Since the exponential functions are never zero, we may always
ensure that $\psi$ itself is continuous by taking any value we like for $b$ and then
choosing $a$ appropriately. Once $\psi$ has been made to be continuous, $\psi'$ will
be continuous provided that $\psi'(x)/\psi(x)$ has the same value as we approach
$\pm A$ from inside the well or from the outside. To obtain the condition (5.9),
we compute $\psi'/\psi$ from (5.6) and then from (5.8), evaluate both quantities
at $x = -A$, and then equate the two values of $\psi'/\psi$. Because we have
made our solution an even function, we get the same matching condition
at $x = A$ as at $x = -A$.

Now, in deriving (5.9), we implicitly assumed that $\psi$ is nonzero at $x = \pm A$.
We do not, however, get any nonzero solutions in which $\psi(\pm A) = 0$.
After all, at points where the cosine function in (5.8) is zero, its derivative
is nonzero. But no choice of the constant in front of the exponentials (5.6)
and (5.7) will produce a function that is zero but has a derivative that is
nonzero. ■

Proposition 5.3 For all positive values of $c$ and $A$, there exists at least
one $\varepsilon \in (0, c)$ such that (5.9) holds.

Proof. Case 1: $\sqrt{c}A < \pi/2$. In this case, as $\varepsilon$ varies between 0 and $c$,
the left-hand side of (5.9) will vary between 0 and some positive number,
whereas the right-hand side of (5.9) will vary between some positive number
and 0. By the intermediate value theorem, there must exist $\varepsilon \in (0, c)$ for
which (5.9) holds. See Fig. 5.2.

Case 2: $\sqrt{c}A \ge \pi/2$. In this case, there is $\varepsilon_0 \in [0, c]$ for which $\sqrt{c - \varepsilon_0}\,A = \pi/2$.
As $\varepsilon$ decreases from $c$ to $\varepsilon_0$, the right-hand side of (5.9) will vary from
0 to $+\infty$. Thus, for $\varepsilon$ slightly larger than $\varepsilon_0$, the right-hand side of (5.9)
will be larger than the left-hand side, whereas at $\varepsilon = c$ the right-hand side
is 0 and the left-hand side is positive. By the intermediate value theorem,
there must exist $\varepsilon \in (\varepsilon_0, c)$ for which (5.9) holds. See Fig. 5.3 for a case with
$\sqrt{c}A$ slightly larger than $\pi/2$ and Fig. 5.4 for a case with $\sqrt{c}A$ much larger
than $\pi/2$. ■

Note that if $\sqrt{c}A$ is much larger than $\pi/2$, then there will be multiple
solutions of (5.9), as can be seen in Fig. 5.4.
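
The intermediate value argument in the proof of Proposition 5.3 is constructive enough to compute with. The following is a minimal sketch, assuming the substitution $u = \sqrt{c - \varepsilon}\,A$ (our own change of variables, with $R = \sqrt{c}A$), under which (5.9) becomes $u\tan u = \sqrt{R^2 - u^2}$. Multiplying through by $\cos u$ avoids the poles of the tangent, and each interval $(n\pi, n\pi + \pi/2)$ with $n\pi < R$ then brackets exactly one root, which is found by bisection. The function name is hypothetical.

```python
import math


def even_levels(c, A):
    """All eps in (0, c) satisfying the even matching condition (5.9)."""
    R = math.sqrt(c) * A
    levels = []
    n = 0
    while n * math.pi < R:
        sign = 1.0 if n % 2 == 0 else -1.0  # sign of cos(u) on this branch

        def g(u):
            # (5.9) times A*cos(u), oriented so that g < 0 at the left end.
            root = math.sqrt(max(R * R - u * u, 0.0))
            return sign * (u * math.sin(u) - root * math.cos(u))

        lo = n * math.pi
        hi = min(R, lo + math.pi / 2)
        for _ in range(200):  # bisection: g(lo) < 0 < g(hi)
            mid = 0.5 * (lo + hi)
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        u = 0.5 * (lo + hi)
        levels.append(c - (u / A) ** 2)
        n += 1
    return levels
```

For $\sqrt{c}A = 1$ this returns a single level, while for $\sqrt{c}A = 30$ it returns ten, in line with the remark that a deep well admits multiple solutions of (5.9).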

We have found, then, at least one solution $\psi$ to (5.4) that satisfies the
matching condition and for which both $\psi$ and $\psi''$ decay exponentially at
infinity. Since this $\psi$ belongs to the domain of $\hat{H}$, we have established the
following result.

Proposition 5.4 For any positive values of $A$ and $C$, there exists at least
one value of $E$ in the range $-C < E < 0$ for which (5.2) has a nonzero
solution in the domain of $\hat{H}$, given by the formula

$$\psi(x) = \begin{cases} \cos(\sqrt{c - \varepsilon}\,x), & -A \le x \le A \\ \cos(\sqrt{c - \varepsilon}\,A)\exp[-\sqrt{\varepsilon}(|x| - A)], & |x| > A \end{cases}$$

where $c$ and $\varepsilon$ are defined in (5.5) and where $\varepsilon$ satisfies (5.9).

In Proposition 5.4, we have not normalized $\psi$ to be a unit vector in
$L^2(\mathbb{R})$, but rather have normalized $\psi$ to equal 1 at the origin. In Figs. 5.5-5.7,
we plot our eigenfunction in several different cases. In Fig. 5.5, we have
a “shallow” well, with $\sqrt{c}A = 1$. In that case, we obtain only one even
eigenvector, which is the ground state of the system (i.e., the eigenvector
with the smallest eigenvalue). Next, we consider a “deep” well, with $\sqrt{c}A = 30$.
For this well, the ground state is shown in Fig. 5.6 and an “excited state”
(i.e., an eigenvector with an eigenvalue that is not the smallest) is shown
in Fig. 5.7.

FIGURE 5.3. Solving the matching condition, Case 2a.

FIGURE 5.5. Ground state for a shallow potential well.

FIGURE 5.6. Ground state for a deep potential well.

FIGURE 5.7. Excited state for a deep potential well.

Note that in the shallow well, the ground state extends quite a bit beyond
the interval $[-A, A]$, whereas in the deep well, the ground state goes to zero
very quickly as soon as we move outside the well. On the other hand, the
excited state in Fig. 5.7 extends comparatively far outside the well.

It is straightforward to adapt the preceding analysis to the odd case. The
matching condition (5.9) is replaced by

$$\sqrt{\varepsilon} = -\sqrt{c - \varepsilon}\,\cot(\sqrt{c - \varepsilon}\,A) \tag{5.10}$$

(Exercise 2) and the formula for the eigenvectors is now

$$\psi(x) = \begin{cases} \sin(\sqrt{c - \varepsilon}\,x), & -A \le x \le A \\ \pm\sin(\sqrt{c - \varepsilon}\,A)\exp[-\sqrt{\varepsilon}(|x| - A)], & |x| > A \end{cases}$$

where we take the $+$ sign for $x > A$ and the $-$ sign for $x < -A$.

FIGURE 5.9. An odd solution.

If $\sqrt{c}A < \pi/2$, then the matching condition (5.10) will have no solutions,
since the right-hand side of (5.10) will be negative for all $\varepsilon \in (0, c)$.
For large values of $\sqrt{c}A$, there will be several solutions to (5.10). A typical
matching scenario and an associated eigenfunction are plotted in Figs. 5.8
and 5.9.
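
The odd matching condition (5.10) can be solved numerically in the same spirit as the even one. The sketch below again assumes the substitution $u = \sqrt{c - \varepsilon}\,A$ with $R = \sqrt{c}A$ (our own notation), so that (5.10) reads $-u\cot u = \sqrt{R^2 - u^2}$; multiplying by $\sin u$ removes the poles of the cotangent. Roots can occur only where $\cot u < 0$, that is, on the intervals $(n\pi + \pi/2, (n+1)\pi)$, which is why nothing is found when $R < \pi/2$. The function name is hypothetical.

```python
import math


def odd_levels(c, A):
    """All eps in (0, c) satisfying the odd matching condition (5.10)."""
    R = math.sqrt(c) * A
    levels = []
    n = 0
    while n * math.pi + math.pi / 2 < R:
        sign = 1.0 if n % 2 == 0 else -1.0  # sign of sin(u) on this branch

        def g(u):
            # (5.10) times A*sin(u), oriented so that g > 0 at the left end.
            root = math.sqrt(max(R * R - u * u, 0.0))
            return sign * (u * math.cos(u) + root * math.sin(u))

        lo = n * math.pi + math.pi / 2
        hi = min(R, (n + 1) * math.pi)
        for _ in range(200):  # bisection: g(lo) > 0 > g(hi)
            mid = 0.5 * (lo + hi)
            if g(mid) > 0:
                lo = mid
            else:
                hi = mid
        u = 0.5 * (lo + hi)
        levels.append(c - (u / A) ** 2)
        n += 1
    return levels
```

Calling `odd_levels(1.0, 1.0)` gives an empty list, matching the statement that (5.10) has no solutions when $\sqrt{c}A < \pi/2$.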


5.4  Tunneling  and  the  Classically  Forbidden 
Region 

Let us now briefly compare the classical situation to the quantum one.
Classically, if a particle has energy $E$, then since the kinetic energy $p^2/(2m)$
is always non-negative, the particle simply cannot be located at a point $x$
with $V(x) > E$. Thus, the region $V(x) < E$ may be called the “classically
allowed” region and the region $V(x) > E$ the “classically forbidden” region.
In the case of a square well potential (5.1), if $-C < E < 0$, then the “well”
itself (i.e., the region with $-A < x < A$) is the classically allowed region
and the outside of the well (i.e., the region with $|x| > A$) is the classically
forbidden region.

Quantum mechanically, if $\hat{H}\psi = E\psi$, then the particle has a definite
value for the energy, namely $E$. We see, however, that such a particle has
a nonzero probability of being located in the classically forbidden region.
Note that although the wave function is not zero in the classically forbidden
region, it does decay exponentially with the distance from the classically
allowed region. That is to say, the quantum particle can penetrate some
distance into the classically forbidden region. Note, however, that if $E$ is
much less than zero (i.e., $\varepsilon$ is large) then a state with $\hat{H}\psi = E\psi$ will decay
very rapidly outside the well (like $\exp[-\sqrt{\varepsilon}(|x| - A)]$).
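
The size of the penetration into the classically forbidden region can be quantified for the even eigenfunction of Proposition 5.4, since both pieces of $|\psi|^2$ integrate in closed form. The following is a rough sketch under our own conventions: the ground-state $\varepsilon$ is found by bisection on the first branch of (5.9), using the substitution $u = \sqrt{c - \varepsilon}\,A$, and the probability of finding the particle in $|x| > A$ is the ratio of the outside integral to the total. The function names are hypothetical.

```python
import math


def ground_state_eps(c, A):
    """Smallest-u solution of (5.9), written in terms of u = sqrt(c - eps) * A."""
    R = math.sqrt(c) * A
    lo, hi = 0.0, min(R, math.pi / 2)
    for _ in range(200):
        u = 0.5 * (lo + hi)
        # (5.9) times A*cos(u); negative left of the root, positive right of it.
        if u * math.sin(u) - math.sqrt(max(R * R - u * u, 0.0)) * math.cos(u) < 0:
            lo = u
        else:
            hi = u
    u = 0.5 * (lo + hi)
    return c - (u / A) ** 2


def forbidden_probability(c, A, eps):
    """Probability of |x| > A for the even eigenfunction of Proposition 5.4."""
    k = math.sqrt(c - eps)
    kappa = math.sqrt(eps)
    # Integral of cos^2(k x) over [-A, A].
    inside = A + math.sin(2 * k * A) / (2 * k)
    # Integral of cos^2(k A) exp(-2 kappa (|x| - A)) over |x| > A.
    outside = math.cos(k * A) ** 2 / kappa
    return outside / (inside + outside)
```

With $\sqrt{c}A = 1$ roughly a third of the probability lies outside the well, whereas with $\sqrt{c}A = 30$ the ground state is almost entirely confined.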

More generally, we can think about the time-dependent Schrödinger
equation for a particle with energy approximately equal to $E$. If we require
that the energy be exactly equal to $E$, then there is no interesting
time-dependence, since the solution to the time-dependent Schrödinger equation
is simply a constant (a time-dependent phase factor) times $\psi_0$. We can,
however, think of a particle where the uncertainty in the energy is nonzero
but small. Suppose such a particle is traveling through a region with $V < E$
and then approaches a region with $V > E$ (a potential barrier). Classically,
the particle would just reflect off of this barrier and go back in the other
direction. Quantum mechanically, though, it is possible for the particle to
“tunnel” through the potential barrier and come out the other side. That is
to say, at some later time, there will be some non-negligible portion of the
wave function on the far side of the barrier.

Our analysis of the eigenvector equation (5.2) for $-C < E < 0$ shows that
there are only finitely many values of $E$ in this range for which we get
square-integrable solutions. It is not hard to analyze the case $E < -C$
with the result that all nonzero solutions grow exponentially in at least
one direction (Exercise 6). Meanwhile, for $E > 0$, any solution to (5.2) on
$(-\infty, -A)$ has sinusoidal behavior and is not square-integrable unless it
is identically zero, in which case (by our matching condition) the solution
must be zero everywhere.

The upshot is that we obtain only finitely many square-integrable
solutions to (5.2), up to multiplying each solution by a constant. Clearly,
then, the “true” eigenvectors for $\hat{H}$ [i.e., the ones that actually
belong to the Hilbert space $L^2(\mathbb{R})$] cannot form an orthonormal basis
for $L^2(\mathbb{R})$. Nevertheless, the spectral theorem (Chap. 7) provides
something like an orthonormal-basis decomposition of elements of
$L^2(\mathbb{R})$ in terms of the solutions to (5.2). A general element $\psi$
of $L^2(\mathbb{R})$ will be a sum of two terms. The first term is a linear
combination of the true ($L^2$) eigenvectors for $\hat{H}$, which have $E < 0$.
The second term is a continuous superposition (i.e., an integral) of the
non-square-integrable “generalized eigenvectors” with $E > 0$.

In Chap. 9, we will introduce the notion of the spectrum of a (possibly
unbounded) self-adjoint operator $A$. We will see that a number $\lambda$
belongs to the spectrum of $A$ if for all $\varepsilon > 0$ there exists a unit
vector $\psi$ in the domain of $A$ for which
$\|A\psi - \lambda\psi\| < \varepsilon$. In the case of the Hamiltonian
operator $\hat{H}$ with a square well potential, it is not hard to show that
every real number $E$ with $E > 0$ belongs to the spectrum of $\hat{H}$
(Exercise 4). It can be shown that if a number $E < 0$ is not an eigenvalue
(i.e., if there are no nonzero $L^2$ solutions to $\hat{H}\psi = E\psi$), then
$E$ is not an element of the spectrum of $\hat{H}$. This result is hinted at by
Exercise 5. Thus, the spectrum of $\hat{H}$ consists of a finite number of
points in $(-C, 0)$ (at least one), together with the whole half line
$[0, \infty)$.


5.6  Exercises 


1. (a) Suppose $\psi$ is a smooth function on each of the intervals
$(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$ and that both $\psi$ and $\psi'$
are continuous at $x = A$ and at $x = -A$. Show that for any smooth
function $\chi$ with compact support, we have

$$\int_{-\infty}^{\infty} \chi''(x)\psi(x)\,dx = \int_{-\infty}^{\infty} \chi(x)\psi''(x)\,dx, \tag{5.11}$$

where we leave $\psi''(x)$ undefined at $x = \pm A$ if the second derivative
does not exist at those points. (In light of Definition A.28, (5.11) means
that the second derivative of $\psi$, in the distribution sense, is simply the
function $\psi''$.)

Hint: Choose some interval $[-R, R]$ with $R > A$ containing the
support of $\chi$. Now use integration by parts separately on each
of the intervals $[-R, -A]$, $[-A, A]$, and $[A, R]$, paying careful
attention to the boundary terms.

(b) Suppose now that $\psi$ is a smooth function on each of the intervals
$(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, and that both $\psi$ and $\psi'$
have left and right limits at $x = \pm A$, but that, say, $\psi'$ has a
discontinuity at $x = -A$. Show that (5.11) has to be modified
by adding a nonzero multiple of $\chi(-A)$ to the right-hand side.


2. Verify the matching condition (5.10) for odd solutions of the
time-independent Schrödinger equation.


3. Let $\omega$ be a nonzero real number and consider a function of the form

$$\psi(x) = a\cos(\omega x) + b\sin(\omega x),$$

for real numbers $a$ and $b$. If $a$ and $b$ are not both zero, show that for
any $A \in \mathbb{R}$, we have

$$\lim_{B \to +\infty} \int_A^B \psi(x)^2\,dx = +\infty.$$


4. Let $f$ be a $C^\infty$ function on the interval $(0, 1)$ with the property
that $f(x) = 1$ for $0 < x < 1/3$ and $f(x) = 0$ for $2/3 < x < 1$. Then define
a family of “cutoff” functions $\chi_n$ on $\mathbb{R}$ by the formula

$$\chi_n(x) = \begin{cases} 0, & |x| \ge n + 1 \\ 1, & |x| \le n \\ f(-x - n), & -(n + 1) < x < -n \\ f(x - n), & n < x < n + 1. \end{cases}$$

Given any $E > 0$, let $\psi$ be a nonzero solution to (5.2) for which $\psi(x)$
and $\psi'(x)$ are continuous at $x = \pm A$. Let $\psi_n = \chi_n \psi$. Show
that $\psi_n$ belongs to the domain of $\hat{H}$ and that

$$\lim_{n \to \infty} \frac{\|\hat{H}\psi_n - E\psi_n\|}{\|\psi_n\|} = 0.$$

Note: As we will see in Chap. 9, this implies that every real number
$E$ with $E > 0$ belongs to the spectrum of the operator $\hat{H}$.

Hint: In estimating $\|\psi_n\|$, it may be helpful to apply Exercise 3 to
the real and imaginary parts of $\psi$ outside the well.

5. Suppose $E < 0$ and suppose that there exist no nonzero square-integrable
solutions to (5.2) for which $\psi$ and $\psi'$ are continuous. Let $\psi$
be a nonzero solution of (5.2) for which $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$ and let $\psi_n$ be as in Exercise 4. Show that

$$\frac{\|\hat{H}\psi_n - E\psi_n\|}{\|\psi_n\|}$$

does not tend to zero as $n$ tends to infinity.


6. (a) Show that for $E < -C$, there are no nonzero square-integrable
solutions to (5.2) for which $\psi$ and $\psi'$ are continuous.

(b) Obtain the result of Part (a) when $E = -C$.

Hint: Analyze the even and odd cases separately.


7. Let the ground state for a particle in a square well denote the
eigenvector with the lowest (most negative) eigenvalue, which corresponds
to the largest value for $\varepsilon$.

(a) Show that the ground state is always an even function. That is
to say, show that the largest value of $\varepsilon$ satisfying (5.9) is always
larger than any solution to (5.10).

(b) Show that the ground state is a nowhere-zero function.


6 

Perspectives  on  the  Spectral  Theorem 


6.1  The  Difficulties  with  the  Infinite-Dimensional 
Case 

Suppose $A$ is a self-adjoint $n \times n$ matrix, meaning that
$A_{kj} = \overline{A_{jk}}$ for all $1 \le j, k \le n$. Then a standard result
in linear algebra asserts that there exist an orthonormal basis
$\{v_j\}_{j=1}^n$ for $\mathbb{C}^n$ and real numbers
$\lambda_1, \ldots, \lambda_n$ such that $Av_j = \lambda_j v_j$. (See Theorem 18
in Chap. 8 of [24] and Exercise 4 in Chap. 7.)

We may state the same result in basis-independent language as follows.
Suppose $\mathbf{H}$ is a finite-dimensional Hilbert space and $A$ is a
self-adjoint linear operator on $\mathbf{H}$, meaning that
$\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all
$\phi, \psi \in \mathbf{H}$. Then there exists an orthonormal basis of
$\mathbf{H}$ consisting of eigenvectors for $A$ with real eigenvalues.
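
As a small illustration of the finite-dimensional statement, the following sketch checks it on one hand-picked example. The matrix `A` and the candidate eigenpairs are hypothetical choices made for this illustration (they do not come from the text); the code verifies self-adjointness, the eigenvalue equations, and orthonormality, using an inner product that is conjugate-linear in the first slot.

```python
import math

# Hypothetical 2x2 self-adjoint matrix: A[k][j] equals the conjugate of A[j][k].
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

# Candidate eigenpairs worked out by hand; the checks below verify them.
s3, s6 = math.sqrt(3), math.sqrt(6)
eigenpairs = [
    (1.0, [(1 - 1j) / s3, complex(-1 / s3)]),
    (4.0, [(1 - 1j) / s6, complex(2 / s6)]),
]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def inner(u, v):
    """Inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def is_self_adjoint(M, tol=1e-12):
    """Check M[k][j] == conj(M[j][k]) for all j, k."""
    n = len(M)
    return all(abs(M[k][j] - M[j][k].conjugate()) <= tol
               for j in range(n) for k in range(n))

def residual(M, lam, v):
    """Largest entry (in absolute value) of M v - lam v."""
    return max(abs(w - lam * x) for w, x in zip(apply(M, v), v))

if __name__ == "__main__":
    print("self-adjoint:", is_self_adjoint(A))
    for lam, v in eigenpairs:
        print("eigenvalue", lam, "residual", residual(A, lam, v))
    print("overlap of the two eigenvectors:",
          abs(inner(eigenpairs[0][1], eigenpairs[1][1])))
```

The eigenvalues here are real and the two eigenvectors are orthogonal unit vectors, as the theorem predicts for any self-adjoint matrix.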

Since there is a standard notion of orthonormal bases for general Hilbert
spaces, we might hope that a similar result would hold for self-adjoint
operators on infinite-dimensional Hilbert spaces. Simple examples, however,
show that a self-adjoint operator may not have any eigenvectors. Consider,
for example, $\mathbf{H} = L^2([0, 1])$ and an operator $A$ on $\mathbf{H}$
defined by

$$(A\psi)(x) = x\psi(x). \tag{6.1}$$

Then $A$ satisfies $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$
for all $\phi, \psi \in L^2([0, 1])$, and yet $A$ has no eigenvectors. After
all, if $x\psi(x) = \lambda\psi(x)$, then $\psi$ would have to be supported on
the set where $x = \lambda$, which is a set of measure zero. Thus, only the
zero element of $L^2([0, 1])$ satisfies $A\psi = \lambda\psi$.

Now, a physicist would say that the operator $A$ in (6.1) does have
eigenvectors, namely the distributions $\delta(x - \lambda)$. (See Appendix
A.3.3.) These distributions indeed satisfy
$x\delta(x - \lambda) = \lambda\delta(x - \lambda)$, but they do not belong to
the Hilbert space $L^2([0, 1])$. Such “eigenvectors,” which belong to some
larger space than $\mathbf{H}$, are known as generalized eigenvectors.
Even though these generalized eigenvectors are not actually in the Hilbert
space, we may hope that there is some sense in which they form something
like an orthonormal basis. See Sect. 6.6 for an example of how such a “basis”
might function.

Let us mention in passing that our simple expectation of a true orthonormal
basis of eigenvectors is realized for compact self-adjoint operators,
where an operator $A$ on $\mathbf{H}$ is said to be compact if the image under
$A$ of every bounded set in $\mathbf{H}$ has compact closure; see Theorem VI.16
in Volume I of [34]. The operators of interest in quantum mechanics, however,
are not compact. (Of course, even if a self-adjoint operator is not compact,
it might still have an orthonormal basis of eigenvectors, as, e.g., in the case
of the Hamiltonian operator for a harmonic oscillator. See Chap. 11.)

Meanwhile, there is another serious difficulty that arises with self-adjoint
operators in the infinite-dimensional case. Most of the self-adjoint operators
$A$ of quantum mechanics are unbounded operators, meaning that there is
no constant $C$ such that $\|A\psi\| \le C\|\psi\|$ for all $\psi$. Suppose,
for example, that $A$ is the position operator $X$ on $L^2(\mathbb{R})$, given
by $(X\psi)(x) = x\psi(x)$. If $1_E$ denotes the indicator function of $E$ (the
function that is 1 on $E$ and 0 elsewhere), then it is apparent that

$$\|X 1_{[n,n+1]}\| \ge n\,\|1_{[n,n+1]}\|$$

for every positive integer $n$, and, thus, $X$ cannot be bounded. Now, using
the closed graph theorem and elementary results from Sect. 9.3, it can be
shown that if $A$ is defined on all of $\mathbf{H}$ and satisfies
$\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all
$\phi, \psi \in \mathbf{H}$, then $A$ must be bounded. (See Corollary 9.9.)
Thus, if $A$ is unbounded and self-adjoint, it cannot be defined on all of
$\mathbf{H}$.
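
The unboundedness estimate for the position operator can be made concrete with exact integrals. The sketch below is only an illustration; the function names are invented here. It uses the fact that the squared norm of $x \cdot 1_{[n,n+1]}$ is the integral of $x^2$ over $[n, n+1]$, which equals $n^2 + n + 1/3$, while the indicator function itself has norm 1.

```python
import math

def norm_indicator(n):
    """L^2 norm of the indicator function of [n, n+1] (an interval of length 1)."""
    return 1.0

def norm_X_indicator(n):
    """L^2 norm of x * 1_[n,n+1], from the exact integral of x^2 over [n, n+1]."""
    return math.sqrt(((n + 1) ** 3 - n ** 3) / 3.0)

def growth_ratio(n):
    """||X 1_[n,n+1]|| / ||1_[n,n+1]||, which is at least n."""
    return norm_X_indicator(n) / norm_indicator(n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, growth_ratio(n))
```

Since the ratio grows without bound as $n$ increases, no single constant $C$ can satisfy $\|X\psi\| \le C\|\psi\|$ for all $\psi$.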

We define, then, an “unbounded operator on $\mathbf{H}$” to be a linear
operator $A$ from a dense subspace of $\mathbf{H}$ (known as the domain of $A$)
to $\mathbf{H}$. The notion of self-adjointness for such operators is more
complicated than in the bounded case. The obvious condition, that
$\langle \phi, A\psi \rangle$ should equal $\langle A\phi, \psi \rangle$ for
all $\phi$ and $\psi$ in the domain of $A$, is not the “right” condition.
Specifically, that condition is not sufficient to guarantee that the spectral
theorem applies to $A$. Rather, for any unbounded operator $A$, we will define
the adjoint $A^*$ of $A$, which will be an unbounded operator with its own
domain. An unbounded operator is then defined to be self-adjoint if the domains
of $A$ and $A^*$ are the same and $A$ and $A^*$ agree on their common domain.
That is to say, self-adjointness means not only that $A$ and $A^*$ agree
whenever they are both defined, but also that the domains of $A$ and $A^*$
agree.


6.2  The  Goals  of  Spectral  Theory

Before getting into the details of the spectral theory, let us think for a
moment about what it is we want the spectral theorem to do for us. In the
first place, we would like the spectral theorem to allow us to apply various
functions to an operator. We saw, for example, that the time-dependent
Schrödinger equation can be “solved” by setting
$\psi(t) = \exp\{-it\hat{H}/\hbar\}\psi_0$. Because the Hamiltonian operator
$\hat{H}$ is unbounded, it is not convenient to use power series to define the
exponential. If, however, $\hat{H}$ has a true orthonormal basis $\{e_k\}$ of
eigenvectors with corresponding eigenvalues $\lambda_k$, then we can define
$\exp\{-it\hat{H}/\hbar\}$ to be the unique bounded operator with the property
that

$$e^{-it\hat{H}/\hbar} e_k = e^{-it\lambda_k/\hbar} e_k$$

for all $k$.
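
The eigenbasis definition of the exponential can be sketched in a finite-dimensional toy model. Everything specific below is an assumption made for illustration: the two-level "Hamiltonian" with matrix [[0, 1], [1, 0]], its hand-computed eigenpairs, and units in which $\hbar = 1$. The function expands the initial state in the eigenbasis and multiplies each coefficient by the phase $e^{-it\lambda_k/\hbar}$.

```python
import cmath
import math

HBAR = 1.0  # assumed units for this toy model

# Assumed toy Hamiltonian [[0, 1], [1, 0]]: eigenvalues +1 and -1 with
# orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
R = 1.0 / math.sqrt(2.0)
eigenpairs = [
    (1.0, [R, R]),
    (-1.0, [R, -R]),
]

def evolve(psi0, t):
    """Apply exp(-i t H / hbar) to psi0 using the eigenbasis definition."""
    out = [0j] * len(psi0)
    for lam, e in eigenpairs:
        # Coefficient <e_k, psi0> (conjugate-linear in the first slot).
        c = sum(a.conjugate() * b for a, b in zip(e, psi0))
        phase = cmath.exp(-1j * t * lam / HBAR)
        for i in range(len(out)):
            out[i] += phase * c * e[i]
    return out

if __name__ == "__main__":
    psi_t = evolve([1, 0], 0.7)
    print(psi_t)  # compare with (cos t, -i sin t) at t = 0.7
    print(sum(abs(z) ** 2 for z in psi_t))  # the norm is preserved
```

Because each eigen-coefficient is only multiplied by a number of absolute value 1, the resulting operator is automatically bounded (indeed unitary), which is the feature that carries over to the unbounded case.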

In cases where $\hat{H}$ does not have a true orthonormal basis of
eigenvectors, we would like the spectral theorem to provide a “functional
calculus” for $\hat{H}$, that is, a system for applying functions (including
exponentials) to $\hat{H}$. This functional calculus should have properties
similar to what we have in the case of a true orthonormal basis of
eigenvectors.

In the second place, we would like the spectral theorem to provide a
probability distribution for the result of measuring a self-adjoint operator
$A$. Let us recall how measurement probabilities work in the case that
$A$ has a true orthonormal basis $\{e_j\}$ of eigenvectors with eigenvalues
$\lambda_j$. Building on Example 3.12, we may compute the probabilities in such
a case as follows. Given any Borel subset $E$ of $\mathbb{R}$, let $V_E$ be the
closed span of all the eigenvectors for $A$ with eigenvalues in $E$, and let
$P_E$ be the orthogonal projection onto $V_E$. Then for any unit vector $\psi$,
we have

$$\mathrm{prob}_\psi(A \in E) = \langle \psi, P_E \psi \rangle. \tag{6.2}$$

In particular, if the eigenvalues are distinct and $\psi$ decomposes as
$\psi = \sum_j c_j e_j$, the probability of observing the value $\lambda_j$
will be $|c_j|^2$ (as in Example 3.12), since $P_{\{\lambda_j\}}$ is just the
projection onto $e_j$.
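
In the eigenbasis case, formula (6.2) reduces to adding up $|c_j|^2$ over those eigenvalues that lie in $E$. A minimal sketch follows; the eigenvalues, the coefficients, and the function name are hypothetical values chosen so that the state is a unit vector.

```python
def prob_in_set(eigenvalues, coeffs, in_E):
    """prob_psi(A in E): sum of |c_j|^2 over eigenvalues lambda_j lying in E.

    `in_E` is a predicate deciding membership in the set E.
    """
    return sum(abs(c) ** 2 for lam, c in zip(eigenvalues, coeffs) if in_E(lam))

# Hypothetical data: three distinct eigenvalues and the coefficients c_j of a
# unit vector psi = sum_j c_j e_j (0.36 + 0.2304 + 0.4096 = 1).
eigenvalues = [-1.0, 0.5, 2.0]
coeffs = [0.6, 0.48j, 0.64]

if __name__ == "__main__":
    print(prob_in_set(eigenvalues, coeffs, lambda lam: True))        # whole line
    print(prob_in_set(eigenvalues, coeffs, lambda lam: lam >= 0.0))  # E = [0, oo)
    print(prob_in_set(eigenvalues, coeffs, lambda lam: lam == 0.5))  # E = {0.5}
```

Taking $E = \mathbb{R}$ gives total probability 1, and taking $E$ to be a single eigenvalue recovers $|c_j|^2$.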

In cases where $A$ does not have a true orthonormal basis of eigenvectors,
we would like the spectral theorem to provide a family of projection
operators $P_E$, one for each Borel subset $E \subset \mathbb{R}$, which will
allow us to define probabilities as in (6.2). We will call these projection
operators spectral projections and the associated subspaces $V_E$ spectral
subspaces. (Thus, $P_E$ is the orthogonal projection onto $V_E$.) Intuitively,
$V_E$ may be thought of as the closed span of all the generalized eigenvectors
with eigenvalues in $E$.

In the first version of the spectral theorem, both these goals will be
achieved, with the spectral projections being provided by a projection-valued
measure and the functional calculus being provided by integration
with respect to this measure. Although having (generalized) eigenvectors
for a self-adjoint operator is, from a practical standpoint, of secondary
importance, we provide a framework for understanding such eigenvectors,
using the concept of a direct integral. The second version of the spectral
theorem decomposes the Hilbert space $\mathbf{H}$ as a direct integral, with
respect to a certain measure $\mu$, of generalized eigenspaces for a
self-adjoint operator $A$. The generalized eigenspace for a particular
eigenvalue $\lambda$ will not actually be a subspace of $\mathbf{H}$, unless
$\mu(\{\lambda\}) > 0$. Thus, the notion of a direct integral gives a rigorous
meaning to the notion of “eigenvectors” that are not actually in the Hilbert
space.


6.3  A  Guide  to  Reading 

Although  the  portion  of  this  book  devoted  to  spectral  theory  is  unavoidably 
technical  in  places,  it  has  been  designed  so  that  the  reader  can  take  in  as 
much  or  as  little  as  desired.  The  reader  who  is  willing  to  take  things  on  faith 
can  simply  take  in  the  examples  of  the  position  and  momentum  operators 
in  Sects.  6.4  and  6.6  and  accept  these  as  prototypes  of  how  the  spectral 
theorem  works.  The  reader  who  wants  more  details  can  find  the  statement 
of  the  spectral  theorem  for  bounded  operators,  in  two  different  forms,  in 
Chap.  7,  and  can  find  the  basics  of  unbounded  self-adjoint  operators  in 
Chap.  9.  Finally,  the  reader  who  wants  a  complete  treatment  of  the  subject 
can  find  full  proofs  of  the  spectral  theorem  in  both  forms,  first  for  bounded 
operators  in  Chap.  8,  and  then  for  unbounded  operators  in  Chap.  10. 


6.4  The  Position  Operator 

As our first example, let us consider the position operator $X$, given by
$(X\psi)(x) = x\psi(x)$, acting on the Hilbert space
$\mathbf{H} = L^2(\mathbb{R})$. As for the similar operator in Sect. 6.1, $X$
has no true eigenvectors, that is, no eigenvectors that are actually in
$\mathbf{H}$. If we think that the generalized eigenvectors for $X$ are the
distributions $\delta(x - \lambda)$, $\lambda \in \mathbb{R}$, then we may make
an educated guess that the spectral subspace $V_E$ should consist of those
functions that are “supported” on $E$, that is, those that are zero almost
everywhere on the complement of $E$. (A superposition of the “functions”
$\delta(x - \lambda)$, with $\lambda \in E$, should be a function supported on
$E$.)

The spectral projection $P_E$ is then the orthogonal projection onto $V_E$,
which may be computed as

$$P_E \psi = 1_E \psi,$$

where $1_E$ is the indicator function of $E$. In that case, we have,
following (6.2),

$$\mathrm{prob}_\psi(X \in E) = \langle \psi, P_E \psi \rangle = \int_E |\psi(x)|^2\,dx.$$

This formula is just what we would have expected from our discussion in
Chap. 3, where we claimed that the probability distribution for the position
of the particle is $|\psi(x)|^2$.
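
For a concrete wave function, the probability that the position lies in an interval $E = [a, b]$ can be approximated numerically. The sketch below assumes a particular unit vector, the Gaussian $\psi(x) = \pi^{-1/4} e^{-x^2/2}$, which is not taken from the text; for this choice the exact answer is $(\operatorname{erf}(b) - \operatorname{erf}(a))/2$, which gives an independent check on the midpoint-rule integral.

```python
import math

def psi(x):
    """Assumed example state: a normalized Gaussian, pi^(-1/4) * exp(-x^2 / 2)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def prob_position(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of |psi(x)|^2 over [a, b]."""
    h = (b - a) / steps
    return sum(abs(psi(a + (i + 0.5) * h)) ** 2 for i in range(steps)) * h

if __name__ == "__main__":
    print(prob_position(-1.0, 1.0), math.erf(1.0))  # the two values should agree closely
    print(prob_position(-8.0, 8.0))                 # essentially the whole line
```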

Meanwhile, let us consider the functional calculus for $X$. If
$f(\lambda) = \lambda^m$, then $f(X)$ should be just the $m$th power of $X$,
which is multiplication by $x^m$. It seems reasonable, then, to think that for
any function $f$, we should define $f(X)$ to be simply multiplication by
$f(x)$. In particular, the operator $e^{iaX}$ should be simply multiplication
by $e^{iax}$, which is a bounded operator on $L^2(\mathbb{R})$.


6.5  Multiplication  Operators 

Since the position operator acts simply as multiplication by the function
$x$, it is straightforward to find the spectral subspaces and also to construct
the functional calculus for $X$. We may consider multiplication operators in
a more general setting. If $\mathbf{H} = L^2(X, \mu)$ and $h$ is a real-valued
measurable function on $X$, then we may define the multiplication operator
$M_h$ on $L^2(X, \mu)$ by

$$M_h \psi = h\psi.$$

We can then construct spectral subspaces as

$$V_E = \{\psi \mid \psi \text{ is supported on } h^{-1}(E)\}$$

and define a functional calculus by

$$f(M_h) = \text{multiplication by } f \circ h.$$

One form of the spectral theorem may now be stated simply as follows: A
self-adjoint operator $A$ on a separable Hilbert space is unitarily equivalent
to a multiplication operator. That is to say, there is some $\sigma$-finite
measure space $(X, \mu)$ and some measurable function $h$ on $X$ such that $A$
is unitarily equivalent to multiplication by $h$. (See Theorem 7.20.) Although
this version of the spectral theorem is compellingly easy to state, there is
a slight modification of it, involving direct integrals, that is in some ways
even better. See Sect. 7.3 for more information.
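
The constructions for multiplication operators can be sketched in the simplest possible setting: a four-point set with counting measure, so that elements of $L^2(X, \mu)$ are just lists of four numbers. The set, the values of `h`, and the function names are all assumptions made for this illustration.

```python
# Assumed toy measure space: X = {0, 1, 2, 3} with counting measure, so a
# function on X is a list of four numbers. The values of h are hypothetical.
h = [2.0, -1.0, 2.0, 0.5]

def M_h(psi):
    """Multiplication operator: (M_h psi)(x) = h(x) psi(x)."""
    return [hv * p for hv, p in zip(h, psi)]

def spectral_projection(in_E, psi):
    """P_E psi = 1_{h^{-1}(E)} psi, where `in_E` tests membership in E."""
    return [p if in_E(hv) else 0.0 for hv, p in zip(h, psi)]

def apply_function(f, psi):
    """Functional calculus: f(M_h) psi = (f o h) psi."""
    return [f(hv) * p for hv, p in zip(h, psi)]

if __name__ == "__main__":
    psi = [1.0, 2.0, 3.0, 4.0]
    print(M_h(psi))
    print(spectral_projection(lambda v: v > 1.0, psi))  # keeps points where h > 1
    print(apply_function(lambda v: v * v, psi))         # agrees with M_h(M_h(psi))
```

Applying the function $f(\lambda) = \lambda^2$ through the functional calculus gives the same result as applying $M_h$ twice, which is the consistency one expects of any reasonable definition of $f(A)$.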


6.6  The  Momentum  Operator 

Let us now see how the spectral theorem works out in the case of the
momentum operator, $P = -i\hbar\, d/dx$ on $L^2(\mathbb{R})$. The
“eigenvectors” for $P$ are the functions $e^{ikx}$, $k \in \mathbb{R}$, with
the corresponding eigenvalues being $\hbar k$. Although the functions
$e^{ikx}$ are not in $L^2(\mathbb{R})$, the Fourier transform shows that any
function in $L^2(\mathbb{R})$ can be expanded as a superposition
(i.e., continuous version of a linear combination) of these functions. (See
Appendix A.3.2.) Indeed, the Fourier transform is very much like the
decomposition of a vector in an orthonormal basis, in that the Fourier
coefficients $\hat{\psi}(k)$ can be expressed in terms of the “inner product”
of a function with $e^{ikx}$:

$$\hat{\psi}(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ikx}\psi(x)\,dx = (2\pi)^{-1/2} \left\langle e^{ikx}, \psi \right\rangle_{L^2(\mathbb{R})},$$

if we ignore the fact that $e^{ikx}$ is not actually in $L^2$.

Indeed, physicists frequently understand the Fourier transform by asserting
that the functions $e^{ikx}/\sqrt{2\pi}$ form an “orthonormal basis in the
continuous sense” for $L^2(\mathbb{R})$. Orthonormality in the continuous sense
is supposed to mean that one replaces the usual Kronecker delta in the
definition of an orthonormal set by the Dirac $\delta$-function:

$$\left\langle \frac{e^{ikx}}{\sqrt{2\pi}}, \frac{e^{ilx}}{\sqrt{2\pi}} \right\rangle_{L^2(\mathbb{R})} = \delta(k - l), \tag{6.3}$$

where $\delta$ is supposed to satisfy

$$\int_{-\infty}^{\infty} f(k)\delta(k - l)\,dk = f(l)$$

for all continuous functions $f$. (Rigorously, $\delta(k - l)$ is a
distribution; see Appendix A.3.3.)

To give some rigorous meaning to (6.3), note that although the inner
product of $e^{ikx}$ and $e^{ilx}$ is not defined, we may approximate this
inner product by the expression

$$\frac{1}{2\pi} \int_{-A}^{A} e^{-ikx} e^{ilx}\,dx = \frac{1}{2\pi} \left[ \frac{e^{-i(k-l)x}}{-i(k-l)} \right]_{-A}^{A} = \frac{A}{\pi} \frac{\sin[A(k-l)]}{A(k-l)}.$$

It is possible to show that the above function, viewed as a function of $k$ for
fixed $A$ and $l$, behaves like $\delta(k - l)$ in the limit as $A$ tends to
infinity. That is to say, for all sufficiently nice functions $\psi$ we have

$$\lim_{A \to \infty} \int_{-\infty}^{\infty} \psi(k)\, \frac{A}{\pi} \frac{\sin[A(k-l)]}{A(k-l)}\,dk = \psi(l). \tag{6.4}$$

Here is a heuristic argument for (6.4). By making the change of variable
$k' = k - l$, we may reduce the general problem to the case $l = 0$. If we then
make the change of variable $\kappa = Ak$, the desired result is equivalent to

$$\lim_{A \to +\infty} \int_{-\infty}^{\infty} \frac{1}{\pi} \frac{\sin\kappa}{\kappa}\, \psi\!\left(\frac{\kappa}{A}\right) d\kappa = \psi(0). \tag{6.5}$$

Now, if we can bring the limit inside the integral, $\psi(\kappa/A)$ will tend
to $\psi(0)$ as $A$ tends to infinity. Since the rest of the integrand in (6.5)
is already independent of $A$, the result would then follow if we could show
that

$$\int_{-\infty}^{\infty} \frac{1}{\pi} \frac{\sin\kappa}{\kappa}\,d\kappa = 1. \tag{6.6}$$

Even though the integral in (6.6) is not absolutely convergent, it is a
convergent improper integral. The value of the integral can be obtained by the
method of contour integration (or the method of consulting a table of
integrals), and indeed (6.6) holds. Since (6.3) is, in any case, only a
heuristic way of thinking about the Fourier transform, we will not take the
time to develop a rigorous version of the preceding argument.
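
The limit (6.4) can also be explored numerically. The following sketch takes $l = 0$ and assumes a specific test function, the Gaussian $e^{-k^2/2}$ (not taken from the text), for which the integral has the closed form $\operatorname{erf}(A/\sqrt{2})$. The function names and the truncation of the integral to $[-12, 12]$ are choices made for this illustration.

```python
import math

def sinc_kernel(A, k):
    """The approximate delta function sin(A k) / (pi k) from the text, with l = 0."""
    if k == 0.0:
        return A / math.pi  # limiting value as k -> 0
    return math.sin(A * k) / (math.pi * k)

def gaussian(k):
    """Assumed test function exp(-k^2 / 2); its value at k = 0 is 1."""
    return math.exp(-k * k / 2.0)

def smeared_value(A, test_fn, half_width=12.0, steps=48000):
    """Midpoint-rule integral of test_fn(k) * sinc_kernel(A, k) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        k = -half_width + (i + 0.5) * h
        total += test_fn(k) * sinc_kernel(A, k)
    return total * h

if __name__ == "__main__":
    for A in (1.0, 5.0, 20.0):
        # Numerical integral next to the closed form erf(A / sqrt(2)).
        print(A, smeared_value(A, gaussian), math.erf(A / math.sqrt(2.0)))
```

As $A$ grows, the integral approaches the value of the test function at $k = 0$, in line with the heuristic argument above.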

It is possible to derive, at least formally, many of the standard properties
of the Fourier transform by using (6.3), just as one can obtain properties
of Fourier series by using the orthonormality of the functions
$e^{2\pi i n x}$ in $L^2([0, 1])$. More importantly, the Fourier transform is
precisely the unitary transformation that changes the momentum operator into a
multiplication operator. To see this property of the Fourier transform more
clearly, we introduce a simple rescaling of it.

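Definition 6.1 For any $\psi \in L^2(\mathbb{R})$, define $\tilde{\psi}$ by

$$\tilde{\psi}(p) = \frac{1}{\sqrt{\hbar}}\, \hat{\psi}\!\left(\frac{p}{\hbar}\right),$$

so that

$$\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} e^{-ipx/\hbar} \psi(x)\,dx.$$

The function $\tilde{\psi}(p)$ is the momentum wave function associated with
$\psi$.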

By the Plancherel theorem (Theorem A.19) and a change of variable, if $\psi$
is a unit vector, then so is $\hat{\psi}$ and also $\tilde{\psi}$. For any unit
vector $\psi$, we interpret $|\tilde{\psi}(p)|^2$ as the probability density
for the momentum of the particle, just as $|\psi(x)|^2$ is the probability
distribution of the position of the particle. Using Proposition A.17, we may
readily verify that for nice enough $\psi$, we have

$$\widetilde{P\psi}(p) = p\,\tilde{\psi}(p). \tag{6.7}$$

Equation (6.7) means that the unitary map $\psi \mapsto \tilde{\psi}$ turns the
momentum operator $P$ into multiplication by $p$. That is to say, the spectral
theorem, in its “multiplication operator” form, is accomplished in this case by
the Fourier transform (scaled as in Definition 6.1).
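
For readers who want to see where (6.7) comes from, here is a sketch of the computation, assuming that $\psi$ is smooth and that $\psi$ and $\psi'$ decay fast enough at infinity for the boundary terms to vanish. It uses $P = -i\hbar\, d/dx$ and the integral formula for the momentum wave function $\tilde{\psi}(p) = (2\pi\hbar)^{-1/2} \int e^{-ipx/\hbar}\psi(x)\,dx$; the text itself appeals to Proposition A.17 rather than to this direct calculation.

```latex
\begin{align*}
\widetilde{P\psi}(p)
  &= \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} e^{-ipx/\hbar}
     \left(-i\hbar\,\psi'(x)\right) dx \\
  &= \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty}
     i\hbar \left(\frac{d}{dx}\, e^{-ipx/\hbar}\right) \psi(x)\, dx
     && \text{(integration by parts; boundary terms vanish)} \\
  &= \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty}
     i\hbar \left(-\frac{ip}{\hbar}\right) e^{-ipx/\hbar}\, \psi(x)\, dx \\
  &= p\,\tilde{\psi}(p).
\end{align*}
```

The factor $i\hbar \cdot (-ip/\hbar) = p$ is exactly what turns differentiation in $x$ into multiplication by $p$.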

In terms of the momentum wave function, we may define spectral projections
and a functional calculus for $P$, just as in Sect. 6.5. For any Borel
set $E \subset \mathbb{R}$, we may define a projection $P_E$ to be the
orthogonal projection onto the space of functions $\psi$ for which
$\tilde{\psi}(p)$ is zero almost everywhere outside of $E$. If $f$ is any
bounded measurable function on $\mathbb{R}$, we can define an operator $f(P)$
by defining $f(P)\psi$ to be the unique element of $L^2(\mathbb{R})$ for which

$$\widetilde{f(P)\psi}(p) = f(p)\,\tilde{\psi}(p).$$


7 

The  Spectral  Theorem  for  Bounded 
Self-Adjoint  Operators:  Statements 


In  the  present  chapter,  we  will  consider  the  spectral  theorem  for  bounded 
self-adjoint  operators,  leaving  a  discussion  of  unbounded  operators  to 
Chaps.  9  and  10.  The  proofs  of  the  main  theorems  (two  different  versions 
of  the  spectral  theorem)  are  moderately  long  and  are  deferred  to  Chap.  8. 
After  some  elementary  definitions  and  results  in  Sect.  7.1,  we  come  to  the 
main  results  in  Sects.  7.2  and  7.3.  Throughout  the  chapter,  H  will,  as  usual, 
denote  a  separable  Hilbert  space  over  C. 


7.1  Elementary  Properties  of  Bounded  Operators 


As usual, we will let $\mathbf{H}$ denote a separable complex Hilbert space.
Recall from Appendix A.3.4 that a linear operator $A$ on $\mathbf{H}$ is said
to be bounded if the operator norm of $A$,

$$\|A\| := \sup_{\psi \in \mathbf{H} \setminus \{0\}} \frac{\|A\psi\|}{\|\psi\|}, \tag{7.1}$$

is finite. The space of bounded operators on $\mathbf{H}$ forms a Banach space
under the operator norm, and we have the inequality

$$\|AB\| \le \|A\|\,\|B\| \tag{7.2}$$

for all bounded operators $A$ and $B$.


Definition 7.1 The Banach space of bounded operators on $\mathbf{H}$, with
respect to the operator norm (7.1), is denoted $\mathcal{B}(\mathbf{H})$.

Recall (Appendix A.4.3) that for any $A \in \mathcal{B}(\mathbf{H})$ there is a
unique operator $A^* \in \mathcal{B}(\mathbf{H})$, called the adjoint of $A$,
such that

$$\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$$

for all $\phi, \psi \in \mathbf{H}$. An operator
$A \in \mathcal{B}(\mathbf{H})$ is called self-adjoint if $A^* = A$.
We say that $A \in \mathcal{B}(\mathbf{H})$ is non-negative if

$$\langle \psi, A\psi \rangle \ge 0 \tag{7.3}$$

for all $\psi \in \mathbf{H}$.


Proposition 7.2 For all $A \in \mathcal{B}(\mathbf{H})$, we have

$$\|A^*\| = \|A\|$$

and

$$\|A^*A\| = \|A\|^2.$$

In particular, if $A$ is self-adjoint, we have the useful result that

$$\|A^2\| = \|A\|^2.$$

Proof. The operator norm of $A$ can also be computed as

$$\|A\| = \sup_{\|\psi\| = 1} \|A\psi\|.$$

Furthermore, for any vector $\phi \in \mathbf{H}$,
$\|\phi\| = \sup_{\|\chi\| = 1} |\langle \chi, \phi \rangle|$. (Inequality in
one direction is by the Cauchy-Schwarz inequality, and inequality in the other
direction is by taking $\chi$ to be a multiple of $\phi$.) Thus,

$$\|A\| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A\psi \rangle|.$$

From this, we get

$$\|A^*\| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A^*\psi \rangle| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, \psi \rangle| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle \psi, A\phi \rangle| = \|A\|.$$

Meanwhile, $\|A^*A\| \le \|A^*\|\,\|A\| = \|A\|^2$. On the other hand,

$$\|A^*A\| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A^*A\psi \rangle| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, A\psi \rangle| \ge \sup_{\|\phi\| = 1} |\langle A\phi, A\phi \rangle| = \|A\|^2,$$

which establishes the inequality in the other order. ■

We  now  record  an  elementary  but  very  useful  result. 

Proposition 7.3 For all $A \in \mathcal{B}(\mathbf{H})$, we have

$$[\mathrm{Range}(A)]^\perp = \ker(A^*),$$

where for any $B \in \mathcal{B}(\mathbf{H})$, $\ker(B)$ denotes the kernel of
$B$.

Proof. Suppose first that $\psi$ belongs to $[\mathrm{Range}(A)]^\perp$. Then
for all $\phi \in \mathbf{H}$, we have

$$0 = \langle \psi, A\phi \rangle = \langle A^*\psi, \phi \rangle. \tag{7.4}$$

This implies that $A^*\psi = 0$ and thus that $\psi \in \ker(A^*)$. Conversely,
suppose $\psi \in \ker(A^*)$. Then for all $\phi \in \mathbf{H}$, (7.4) holds
(reading the equation from right to left). This shows that $\psi$ is orthogonal
to every element of the form $A\phi$, meaning that
$\psi \in [\mathrm{Range}(A)]^\perp$. ■

Next,  we  define  the  spectrum  of  a  bounded  operator,  which  plays  the 
same  role  as  the  set  of  eigenvalues  in  the  finite-dimensional  case. 

Definition 7.4 For $A \in \mathcal{B}(\mathbf{H})$, the resolvent set of $A$,
denoted $\rho(A)$, is the set of all $\lambda \in \mathbb{C}$ such that the
operator $(A - \lambda I)$ has a bounded inverse. The spectrum of $A$, denoted
by $\sigma(A)$, is the complement in $\mathbb{C}$ of the resolvent set. For
$\lambda$ in the resolvent set of $A$, the operator $(A - \lambda I)^{-1}$
is called the resolvent of $A$ at $\lambda$.

Saying that $(A - \lambda I)$ has a bounded inverse means that there exists a
bounded operator $B$ such that

$$(A - \lambda I)B = B(A - \lambda I) = I.$$

If $A$ is bounded and $A - \lambda I$ is one-to-one and maps $\mathbf{H}$ onto
$\mathbf{H}$, then it follows from the closed graph theorem (Theorem A.39) that
the inverse map must be bounded. Thus, the resolvent set of $A$ can
alternatively be described as the set of $\lambda \in \mathbb{C}$ for which
$A - \lambda I$ is one-to-one and onto.


Proposition 7.5 For all $A \in \mathcal{B}(\mathbf{H})$, the following results
hold.

1. The spectrum $\sigma(A)$ of $A$ is a closed, bounded, and nonempty subset
of $\mathbb{C}$.

2. If $|\lambda| > \|A\|$, then $\lambda$ is in the resolvent set of $A$.


Lemma 7.6 Suppose $X \in \mathcal{B}(\mathbf{H})$ satisfies $\|X\| < 1$. Then
the operator $I - X$ is invertible, with the inverse given by the following
convergent series in $\mathcal{B}(\mathbf{H})$:

$$(I - X)^{-1} = I + X + X^2 + X^3 + \cdots \tag{7.5}$$

Proof. As a consequence of (7.2), we have $\|X^m\| \le \|X\|^m$. The
(geometric) series on the right-hand side of (7.5) is therefore absolutely
convergent and thus convergent in the Banach space $\mathcal{B}(\mathbf{H})$
(Appendix A.3.4). If we multiply this series on either side by $(I - X)$,
everything will cancel except $I$, showing that the sum of the series is the
inverse of $(I - X)$. ■
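
The series in the lemma can be tested numerically on a small matrix. In the sketch below, the 2 × 2 matrix `X` is a hypothetical example whose Frobenius norm (about 0.55) is less than 1, which forces the operator norm to be less than 1 as well; the helper names are invented for this illustration. The partial sums of $I + X + X^2 + \cdots$ are multiplied by $I - X$ to check that the product is close to the identity.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def mat_mul(P, Q):
    """Matrix product P Q."""
    return [[sum(P[r][k] * Q[k][c] for k in range(len(Q)))
             for c in range(len(Q[0]))] for r in range(len(P))]

def mat_add(P, Q):
    """Entrywise sum P + Q."""
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def mat_sub(P, Q):
    """Entrywise difference P - Q."""
    return [[a - b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def neumann_partial_sum(X, terms=60):
    """Partial sum I + X + X^2 + ... + X^(terms - 1)."""
    n = len(X)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, X)
        total = mat_add(total, power)
    return total

# Hypothetical example with norm less than 1.
X = [[0.2, 0.3],
     [-0.1, 0.4]]

if __name__ == "__main__":
    S = neumann_partial_sum(X)
    print(mat_mul(mat_sub(identity(2), X), S))  # close to the identity matrix
```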

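Proof of Proposition 7.5. For any nonzero $\lambda \in \mathbb{C}$, consider
the operator

$$A - \lambda I = -\lambda \left( I - \frac{A}{\lambda} \right).$$

If $|\lambda| > \|A\|$, then $\|A/\lambda\| < 1$, and $I - A/\lambda$ is
invertible by the lemma. It then follows that $A - \lambda I$ is invertible,
with

$$(A - \lambda I)^{-1} = -\frac{1}{\lambda} \left( I + \frac{A}{\lambda} + \frac{A^2}{\lambda^2} + \cdots \right). \tag{7.6}$$

Thus, $\lambda$ is in the resolvent set of $A$. This establishes Point 2 in the
proposition and shows that $\sigma(A)$ is bounded.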

Suppose now that $\lambda_0 \in \mathbb{C}$ is in the resolvent set of $A$.
Then for another number $\lambda \in \mathbb{C}$, we have

$$A - \lambda I = A - \lambda_0 I - (\lambda - \lambda_0) I = (A - \lambda_0 I)\left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right). \tag{7.7}$$

Thus, if

$$|\lambda - \lambda_0| < \frac{1}{\|(A - \lambda_0 I)^{-1}\|},$$

both factors on the right-hand side of (7.7) will be invertible, so that
$A - \lambda I$ is also invertible. Thus, the resolvent set of $A$ is open and
the spectrum is closed.

To show that $\sigma(A)$ is nonempty, note that $(A - \lambda I)^{-1}$ may be computed as
follows:

\[ \begin{aligned} (A - \lambda I)^{-1} &= \left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right)^{-1} (A - \lambda_0 I)^{-1} \\ &= \left[ \sum_{m=0}^{\infty} (\lambda - \lambda_0)^m \left( (A - \lambda_0 I)^{-1} \right)^m \right] (A - \lambda_0 I)^{-1}. \end{aligned} \tag{7.8} \]

Thus, near any point $\lambda_0$ in the resolvent set of $A$, the resolvent $(A - \lambda I)^{-1}$
can be computed by the locally convergent series (7.8) in powers of $\lambda - \lambda_0$,
with the coefficients of the series being elements of $\mathcal{B}(\mathbf{H})$. For any $\phi, \psi \in \mathbf{H}$,
the map

\[ \lambda \mapsto \langle \phi, (A - \lambda I)^{-1} \psi \rangle \tag{7.9} \]

will be given by a locally convergent power series with coefficients in $\mathbb{C}$,
meaning that the function (7.9) is a holomorphic function on the resolvent
set of $A$. Furthermore, from (7.6) we can see that $\|(A - \lambda I)^{-1}\|$ tends to
zero as $|\lambda|$ tends to infinity, and so also does the right-hand side of (7.9).

If $\sigma(A)$ were the empty set, the function (7.9) would be holomorphic
on all of $\mathbb{C}$ and tending to zero at infinity. By Liouville’s theorem, the
right-hand side of (7.9) would have to be identically zero for all $\phi$ and
$\psi$, which would mean that $(A - \lambda I)^{-1}$ is the zero operator. But since
$(A - \lambda I)(A - \lambda I)^{-1} = I$, the operator $(A - \lambda I)^{-1}$ cannot be zero. ■

If $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{C}$ and some nonzero $\psi \in \mathbf{H}$, then $A - \lambda I$ has
a nonzero kernel and so $\lambda$ is in the spectrum of $A$. Thus, any eigenvalue
for $A$ is contained in the spectrum of $A$. In the infinite-dimensional case,
however, the converse is not true: A point in the spectrum may not be an
eigenvalue for $A$. Nevertheless, for a bounded self-adjoint operator $A$, the
spectrum of $A$ may be described in a way that is not too far removed from
what we have in the finite-dimensional case.


Proposition 7.7 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then the following results
hold.

1. The spectrum of $A$ is contained in the real line.

2. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of $A$ if and only if there
exists a sequence $\psi_n$ of nonzero vectors in $\mathbf{H}$ such that

\[ \lim_{n \to \infty} \frac{\|A\psi_n - \lambda\psi_n\|}{\|\psi_n\|} = 0. \tag{7.10} \]

Condition 2 in the proposition says that $\lambda \in \mathbb{R}$ belongs to the spectrum
if and only if $\lambda$ is “almost an eigenvalue,” meaning that there exists $\psi \neq 0$
for which $A\psi$ is equal to $\lambda\psi$ plus an error that is small compared to the
size of $\psi$.

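Lemma 7.8 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then for all $\lambda = a + ib \in \mathbb{C}$, we
have

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle \geq b^2 \langle \psi, \psi \rangle. \tag{7.11} \]

Proof. We compute that

\[ \begin{aligned} \langle (A - (a + ib)I)\psi, (A - (a + ib)I)\psi \rangle &= \langle (A - aI)\psi, (A - aI)\psi \rangle + ib \langle \psi, (A - aI)\psi \rangle \\ &\quad - ib \langle (A - aI)\psi, \psi \rangle + b^2 \langle \psi, \psi \rangle. \end{aligned} \tag{7.12} \]

Since $A$ is self-adjoint, so is $A - aI$, from which we see that the second and
third terms on the right-hand side of (7.12) cancel, leaving us with

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle = \langle (A - aI)\psi, (A - aI)\psi \rangle + b^2 \langle \psi, \psi \rangle, \]

from which the desired inequality follows. ■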
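
For a concrete check of (7.11) and of the identity obtained from (7.12), one can take a small real symmetric matrix in place of $A$. The matrix, the value $\lambda = 2 + 0.5i$, and the vector below are hypothetical choices for illustration only.

```python
def apply(M, v):
    """Apply a square matrix (nested lists) to a vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def norm_sq(v):
    """Squared Hilbert-space norm of a complex vector."""
    return sum(abs(x) ** 2 for x in v)

def shifted(M, z):
    """Return the matrix M - z I."""
    n = len(M)
    return [[M[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)]

A = [[2, 1], [1, 2]]      # real symmetric, hence self-adjoint
a, b = 2.0, 0.5           # lambda = a + ib with b != 0
lam = complex(a, b)
psi = [1 + 0j, 1j]

lhs = norm_sq(apply(shifted(A, lam), psi))        # ||(A - lambda I) psi||^2
real_part = norm_sq(apply(shifted(A, a), psi))    # ||(A - a I) psi||^2
rhs = b ** 2 * norm_sq(psi)                       # b^2 ||psi||^2
```

Here `lhs` equals `real_part + rhs`, as in the last display of the proof, and in particular `lhs >= rhs`.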
Proof of Proposition 7.7. For Point 1, we need to show that any complex
number $\lambda = a + ib$ with $b \neq 0$ belongs to the resolvent set of $A$. Since
$b \neq 0$, (7.11) shows that $A - \lambda I$ is injective. Meanwhile, by Proposition 7.3,
$\operatorname{Range}(A - \lambda I)^{\perp} = \ker(A - \bar{\lambda} I)$. Since $\bar{\lambda}$ also has nonzero imaginary part,
$A - \bar{\lambda} I$ is injective, and so the range of $A - \lambda I$ is dense in $\mathbf{H}$. To show
that the range is all of $\mathbf{H}$, consider any $\phi \in \mathbf{H}$ and choose a sequence
$\phi_n = (A - \lambda I)\psi_n$ in $\operatorname{Range}(A - \lambda I)$ with $\phi_n \to \phi$. Applying (7.11) with $\psi$
replaced by $\psi_n - \psi_m$ shows that $(\psi_n)$ is a Cauchy sequence. Thus, $\psi_n \to \psi$
for some $\psi \in \mathbf{H}$. Since $A$ is bounded,

\[ (A - \lambda I)\psi = \lim_{n \to \infty} (A - \lambda I)\psi_n = \lim_{n \to \infty} \phi_n = \phi. \]

We conclude, then, that $A - \lambda I$ is one-to-one and onto. The inverse operator
$(A - \lambda I)^{-1}$ is bounded, by (7.11) (or by the closed graph theorem).

For Point 2, assume there exists a sequence as in (7.10), and suppose that
$A - \lambda I$ had an inverse. Letting $\phi_n = (A - \lambda I)\psi_n$, we have $\psi_n = (A - \lambda I)^{-1}\phi_n$
and so (7.10) says that

\[ \lim_{n \to \infty} \frac{\|\phi_n\|}{\|(A - \lambda I)^{-1}\phi_n\|} = 0, \]

which shows that $(A - \lambda I)^{-1}$ is actually unbounded. Thus, $A - \lambda I$ cannot
have a bounded inverse.

Conversely, if, for some $\lambda \in \mathbb{R}$, no such sequence exists, then there exists
some $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \geq \varepsilon \|\psi\| \tag{7.13} \]

for all $\psi \in \mathbf{H}$. Then $A - \lambda I$ is injective and Proposition 7.3 tells us that
the range of the self-adjoint operator $A - \lambda I$ is dense in $\mathbf{H}$. Arguing as in
the preceding paragraphs with (7.13) in place of (7.11), we can see that the
range of $A - \lambda I$ is also closed, hence all of $\mathbf{H}$. This shows that $A - \lambda I$ has
an inverse. ■


Example 7.9 Let $\mathbf{H} = L^2([0, 1])$ and let $A$ be the operator on $\mathbf{H}$ defined
by

\[ (A\psi)(x) = x\psi(x). \]

Then this operator is bounded and self-adjoint, and its spectrum is given by

\[ \sigma(A) = [0, 1]. \]

As we have already noted in Sect. 6.1, the operator $A$ does not have any
(true) eigenvectors.

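Proof. It is apparent that $\|A\psi\| \leq \|\psi\|$ and that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all
$\phi, \psi \in \mathbf{H}$, so that $A$ is bounded and self-adjoint. Given $\lambda \in (0, 1)$, consider
the functions $\psi_n := 1_{[\lambda, \lambda + 1/n]}$, which satisfy $\|\psi_n\|^2 = 1/n$ once $n$ is large
enough that $\lambda + 1/n \leq 1$. On the other hand, since $|x - \lambda| \leq 1/n$ on
$[\lambda, \lambda + 1/n]$, we have

\[ \|(A - \lambda I)\psi_n\|^2 \leq \frac{1}{n^2} \|\psi_n\|^2 = \frac{1}{n^3}, \]

so that $\|(A - \lambda I)\psi_n\| / \|\psi_n\| \leq 1/n$, which tends to zero.
Thus, by Proposition 7.7, $\lambda$ belongs to the spectrum of $A$. Since this holds
for all $\lambda \in (0, 1)$ and the spectrum of $A$ is closed, $\sigma(A) \supset [0, 1]$.

Meanwhile, if $\lambda \notin [0, 1]$, then the function $1/(x - \lambda)$ is bounded on
$[0, 1]$, and so $A - \lambda I$ has a bounded inverse, consisting of multiplication by
$1/(x - \lambda)$. Thus, $\sigma(A) = [0, 1]$. ■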
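
The “almost eigenvector” estimate in this example can be made fully explicit. The sketch below uses the closed forms $\|\psi_n\|^2 = 1/n$ and $\|(A - \lambda I)\psi_n\|^2 = \int_\lambda^{\lambda + 1/n} (x - \lambda)^2 \, dx = 1/(3n^3)$, which hold whenever $\lambda + 1/n \leq 1$; the function names are hypothetical.

```python
from fractions import Fraction
import math

def norm_sq_psi(n):
    """||psi_n||^2: the length of the interval [lam, lam + 1/n]."""
    return Fraction(1, n)

def norm_sq_residual(n):
    """||(A - lam I) psi_n||^2: integral of (x - lam)^2 over [lam, lam + 1/n]."""
    return Fraction(1, 3 * n ** 3)

def ratio(n):
    """The quotient appearing in (7.10), evaluated for psi_n."""
    return math.sqrt(norm_sq_residual(n) / norm_sq_psi(n))

ratios = [ratio(n) for n in (10, 100, 1000)]
```

The quotient equals $1/(\sqrt{3}\, n)$ exactly, independent of $\lambda$, which is slightly better than the bound $1/n$ used in the proof and tends to zero as required by Proposition 7.7.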


7.2  Spectral  Theorem  for  Bounded  Self-Adjoint 
Operators,  I 

7.2.1  Spectral  Subspaces 

Given a bounded (for now) self-adjoint operator $A$, we hope to associate
with each Borel set $E \subset \sigma(A)$ a closed subspace $V_E$ of $\mathbf{H}$, where we think
intuitively that $V_E$ is the closed span of the generalized eigenvectors for $A$
with eigenvalues in $E$. [We could do this more generally for any $E \subset \mathbb{R}$,
but we do not expect any contribution from $\mathbb{R} \setminus \sigma(A)$.] We would expect the
collection of these subspaces to have the following properties.

1. $V_{\sigma(A)} = \mathbf{H}$ and $V_{\emptyset} = \{0\}$.

2. If $E$ and $F$ are disjoint, then $V_E \perp V_F$.

3. For any $E$ and $F$, $V_{E \cap F} = V_E \cap V_F$.

4. If $E_1, E_2, \ldots$ are disjoint and $E = \cup_j E_j$, then

\[ V_E = \bigoplus_j V_{E_j}. \]

5. For any $E$, $V_E$ is invariant under $A$.

6. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ and $\psi \in V_E$, then

\[ \|(A - \lambda_0 I)\psi\| \leq \varepsilon \|\psi\|. \]
The condition $V_{\sigma(A)} = \mathbf{H}$ captures the idea that our generalized eigenvectors
should span $\mathbf{H}$, while Property 2 captures the idea that our generalized
eigenvectors should have some sort of orthogonality for distinct eigenvalues,
even if they are not actually in the Hilbert space. In Property 4, there
may be infinitely many of the $E_j$’s, in which case, the direct sum is in the
Hilbert space sense (Definition A.45). Properties 5 and 6 capture the idea
that $V_E$ is made up of generalized eigenvectors for $A$ with eigenvalues in $E$.

7.2.2 Projection-Valued Measures

It is convenient to describe closed subspaces of a Hilbert space $\mathbf{H}$ in terms of
the associated orthogonal projection operators. Recall (Proposition A.57)
that, given a closed subspace $V$ of $\mathbf{H}$, there exists a unique bounded operator
$P$ that equals the identity on $V$ and equals zero on the orthogonal
complement $V^{\perp}$ of $V$. This operator is called the orthogonal projection
onto $V$ and satisfies $P^2 = P$ and $P^* = P$. The following definition expresses
the first four properties of our spectral subspaces (the ones that
do not involve the operator $A$) in terms of the corresponding orthogonal
projections. Since those properties are similar to those of a measure, we
use the term projection-valued measure.

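Definition 7.10 Let $X$ be a set and $\Omega$ a $\sigma$-algebra in $X$. A map $\mu : \Omega \to
\mathcal{B}(\mathbf{H})$ is called a projection-valued measure if the following properties
are satisfied.

1. For each $E \in \Omega$, $\mu(E)$ is an orthogonal projection.

2. $\mu(\emptyset) = 0$ and $\mu(X) = I$.

3. If $E_1, E_2, E_3, \ldots$ in $\Omega$ are disjoint, then for all $v \in \mathbf{H}$, we have

\[ \mu\left( \bigcup_{j=1}^{\infty} E_j \right) v = \sum_{j=1}^{\infty} \mu(E_j) v, \]

where the convergence of the sum is in the norm topology on $\mathbf{H}$.

4. For all $E_1, E_2 \in \Omega$, we have $\mu(E_1 \cap E_2) = \mu(E_1)\mu(E_2)$.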

Note that if $E_1$ and $E_2$ are disjoint, then Properties 2 and 4 tell us
that $\mu(E_1)\mu(E_2) = 0$, from which it follows (Exercise 10) that the range
of $\mu(E_1)$ and the range of $\mu(E_2)$ are perpendicular. It is then not hard to
verify that $\mu(E_1)\mu(E_2)$ is the projection onto the intersection of the ranges
of $\mu(E_1)$ and $\mu(E_2)$ (Exercise 11). Thus, if we define, for each $E \in \Omega$, a
closed subspace $V_E := \operatorname{Range}(\mu(E))$, then the collection of $V_E$’s satisfies the
first four properties that we anticipated for spectral subspaces.
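
To see the definition at work in the simplest possible setting, the sketch below builds a toy projection-valued measure on the finite set $X = \{0, 1, 2\}$ with values in operators on $\mathbb{C}^3$: the projection assigned to $E$ keeps the coordinates indexed by $E$ and kills the rest. This example and the names `mu` and `matmul` are assumptions made for illustration, not part of the text.

```python
def mu(E, dim=3):
    """Diagonal orthogonal projection onto the coordinates indexed by E."""
    return [[1 if (i == j and i in E) else 0 for j in range(dim)]
            for i in range(dim)]

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(P, Q):
    """Entrywise sum of two square matrices."""
    n = len(P)
    return [[P[i][j] + Q[i][j] for j in range(n)] for i in range(n)]

E1, E2 = {0, 1}, {1, 2}

# Property 4: the projection for an intersection is the product of projections.
product = matmul(mu(E1), mu(E2))
intersection = mu(E1 & E2)

# Property 3 (finite case): additivity over disjoint sets.
disjoint_sum = matadd(mu({0}), mu({2}))
union = mu({0, 2})
```

In this toy case the subspaces $V_E = \operatorname{Range}(\mu(E))$ are coordinate subspaces, and the four anticipated properties of spectral subspaces can be read off directly.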

In the next subsection, we will associate a projection-valued measure $\mu^A$
with each bounded self-adjoint operator $A$. In that case, the projection
$\mu^A(E)$ will be thought of as a projection onto the spectral subspace
corresponding to $E$. We are about to introduce the notion of operator-valued
integration with respect to a projection-valued measure. In the case of the
projection-valued measure $\mu^A$ associated with $A$, this operator-valued
integral will be the functional calculus for $A$.

Observe that, for any projection-valued measure $\mu$ and $\psi \in \mathbf{H}$, we can
form an ordinary (positive) real-valued measure $\mu_\psi$ by setting

\[ \mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle \tag{7.14} \]

for all $E \in \Omega$. This observation provides a link between integration with
respect to a projection-valued measure and integration with respect to an
ordinary measure.

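Proposition 7.11 (Operator-Valued Integration) Let $\Omega$ be a $\sigma$-algebra
in a set $X$ and let $\mu : \Omega \to \mathcal{B}(\mathbf{H})$ be a projection-valued measure. Then
there exists a unique linear map, denoted $f \mapsto \int_X f \, d\mu$, from the space of
bounded, measurable, complex-valued functions on $X$ into $\mathcal{B}(\mathbf{H})$ with the
property that

\[ \left\langle \psi, \left( \int_X f \, d\mu \right) \psi \right\rangle = \int_X f \, d\mu_\psi \tag{7.15} \]

for all $f$ and all $\psi \in \mathbf{H}$, where $\mu_\psi$ is given by (7.14). This integral has the
following additional properties.

1. For all $E \in \Omega$, we have

\[ \int_X 1_E \, d\mu = \mu(E). \]

In particular, the integral of the constant function 1 is $I$.

2. For all $f$, we have

\[ \left\| \int_X f \, d\mu \right\| \leq \sup_{\lambda \in X} |f(\lambda)|. \tag{7.16} \]

3. Integration is multiplicative: For all $f$ and $g$, we have

\[ \int_X f g \, d\mu = \left( \int_X f \, d\mu \right) \left( \int_X g \, d\mu \right). \tag{7.17} \]

4. For all $f$, we have

\[ \int_X \bar{f} \, d\mu = \left( \int_X f \, d\mu \right)^*. \]

In particular, if $f$ is real-valued, then $\int_X f \, d\mu$ is self-adjoint.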

By Property 1 and linearity, integration with respect to $\mu$ has the expected
behavior on simple functions. It then follows from Property 2 that
the integral of an arbitrary bounded measurable function $f$ can be computed
as follows. Take a sequence $s_n$ of simple functions converging uniformly to
$f$; the integral of $f$ is then the limit, in the operator norm topology, of the
integrals of the $s_n$’s.

Although the multiplicative property of the integral may seem surprising
at first, observe that for any $E_1, E_2 \in \Omega$, Property 4 in Definition 7.10 tells
us that

\[ \left( \int_X 1_{E_1} \, d\mu \right) \left( \int_X 1_{E_2} \, d\mu \right) = \mu(E_1)\mu(E_2) = \mu(E_1 \cap E_2) = \int_X 1_{E_1} \cdot 1_{E_2} \, d\mu. \]

Thus, multiplicativity of the integral at the level of indicator functions is
built into the definition of a projection-valued measure.

If one wanted to make a real-valued measure for which the corresponding
integral was multiplicative, then since $1_E \cdot 1_E = 1_E$, the integral of $1_E$
(namely, $\mu(E)$) would have to satisfy $\mu(E)^2 = \mu(E)$. This would mean
that $\mu(E)$ is 0 or 1 for all $E$. For such measures, one would indeed obtain
multiplicativity of the integral, but measures with this property are not
very interesting. For operator-valued measures, we can have interesting
examples where the integral is multiplicative, simply because there are
many more idempotents (elements $A$ with $A^2 = A$) in $\mathcal{B}(\mathbf{H})$ than in $\mathbb{R}$.
Proof of Proposition 7.11. Given a projection-valued measure $\mu$ and a
bounded measurable function $f$ on $X$, define a map $Q_f : \mathbf{H} \to \mathbb{C}$ by

\[ Q_f(\psi) = \int_X f \, d\mu_\psi, \]

where $\mu_\psi$ is given by (7.14). If $f = 1_E$ is an indicator function, then $Q_f(\psi) =
\langle \psi, \mu(E)\psi \rangle$ is a bounded quadratic form. (See Definition A.60.) It is
straightforward to show, passing from indicator functions to simple functions and
then to general functions, that for any bounded measurable $f$, $Q_f$ is a
bounded quadratic form, with

\[ |Q_f(\psi)| \leq \left( \sup_{\lambda \in X} |f(\lambda)| \right) \|\psi\|^2. \tag{7.18} \]

It then follows from Proposition A.63 that there is a unique bounded
operator $A_f$ such that

\[ Q_f(\psi) = \langle \psi, A_f \psi \rangle \]

for all $\psi \in \mathbf{H}$. We set $\int_X f \, d\mu = A_f$. From the way $A_f$ is defined, it
satisfies (7.15). The uniqueness of the linear map $f \mapsto \int_X f \, d\mu$ follows
from the uniqueness in Proposition A.63.

If $f = 1_E$, then $Q_f(\psi) = \mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$, in which case the unique
associated operator $A_f$ is $\mu(E)$. This establishes Property 1. Property 2
follows from (7.18).

For Property 3, we have already observed that multiplicativity of the
integral, at the level of indicator functions, is built into the definition of a
projection-valued measure. Since both sides of (7.17) are bilinear in $(f, g)$,
we have (7.17) for simple functions. Using Property 2, we can then obtain
(7.17) for all bounded measurable functions by taking limits.

Finally, if $f$ is real valued, then $Q_f(\psi)$ will be real for all $\psi \in \mathbf{H}$. Thus, by
Proposition A.63, the associated operator $A_f$ will be self-adjoint. Property
4 then follows by linearity. ■


7.2.3  The  Spectral  Theorem 

We  are  ready  to  state  one  version  of  the  spectral  theorem  for  bounded 
self-adjoint  operators. 

Theorem 7.12 (Spectral Theorem, First Form) If $A \in \mathcal{B}(\mathbf{H})$ is
self-adjoint, then there exists a unique projection-valued measure $\mu^A$ on the
Borel $\sigma$-algebra in $\sigma(A)$, with values in projections on $\mathbf{H}$, such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \tag{7.19} \]

Since the spectrum $\sigma(A)$ of $A$ is bounded, the function $f(\lambda) := \lambda$ is
bounded on $\sigma(A)$. The proof of this theorem is given in Chap. 8.


Definition 7.13 (Functional Calculus) If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$f : \sigma(A) \to \mathbb{C}$ is a bounded measurable function, define an operator $f(A)$
by setting

\[ f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda), \]

where $\mu^A$ is the projection-valued measure in Theorem 7.12.

We may extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to all of
$\mathbb{R}$ by assigning measure 0 to $\mathbb{R} \setminus \sigma(A)$. Then, roughly speaking, $f(A)$ is
the operator that is equal to $f(\lambda) I$ on the range of the projection operator
$\mu^A([\lambda, \lambda + d\lambda))$.

Since the integral with respect to $\mu^A$ is multiplicative, it follows from
(7.19) that if $f(\lambda) = \lambda^m$ for some positive integer $m$, then $f(A)$ is the
$m$th power of $A$. Further, since the series $e^{a\lambda} = \sum_{m=0}^{\infty} (a\lambda)^m / m!$ converges
uniformly on the compact set $\sigma(A)$, the operator $e^{aA}$ (computed using the
functional calculus for the function $f(\lambda) = e^{a\lambda}$) may be computed as a
power series.
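
In finite dimensions the functional calculus reduces to the familiar recipe $f(A) = \sum_j f(\lambda_j) P_j$, where $P_j$ is the projection onto the eigenspace for $\lambda_j$. The sketch below carries this out for the hypothetical matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, whose eigenvalues are 1 and 3; the projections are entered by hand, and all names are illustrative assumptions.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Spectral projections of A = [[2, 1], [1, 2]]:
# eigenvalue 1 has eigenvector (1, -1), eigenvalue 3 has eigenvector (1, 1).
SPECTRAL_PROJECTIONS = {
    1: [[half, -half], [-half, half]],
    3: [[half, half], [half, half]],
}

def functional_calculus(f):
    """Return f(A) = sum over eigenvalues of f(lambda) * P_lambda."""
    result = [[Fraction(0)] * 2 for _ in range(2)]
    for lam, P in SPECTRAL_PROJECTIONS.items():
        for i in range(2):
            for j in range(2):
                result[i][j] += f(lam) * P[i][j]
    return result

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = functional_calculus(lambda t: t)            # recovers A itself, as in (7.19)
A_squared = functional_calculus(lambda t: t ** 2)

# Multiplicativity: (fg)(A) = f(A) g(A), here with f(t) = t + 1 and g(t) = t^2.
f_of_A = functional_calculus(lambda t: t + 1)
fg_of_A = functional_calculus(lambda t: (t + 1) * t ** 2)
```

The integral against $\mu^A$ is here a finite sum over the two points of the spectrum, so the statements about powers of $A$ can be checked by direct matrix multiplication.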

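Definition 7.14 (Spectral Subspaces) For a self-adjoint $A \in \mathcal{B}(\mathbf{H})$, let $\mu^A$ be the
associated projection-valued measure, extended to be a measure on $\mathbb{R}$ by
setting $\mu^A(\mathbb{R} \setminus \sigma(A)) = 0$. Then for each Borel set $E \subset \mathbb{R}$, define the
spectral subspace $V_E$ of $\mathbf{H}$ by

\[ V_E = \operatorname{Range}(\mu^A(E)). \]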

The definition of a projection-valued measure implies that these spectral
subspaces satisfy the first four properties listed in Sect. 7.2.1. We now show
that (7.19) implies the remaining two properties we anticipated for the
spectral subspaces.

Proposition 7.15 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the spectral subspaces
associated with $A$ have the following properties.

1. Each spectral subspace $V_E$ is invariant under $A$.

2. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ then for all $\psi \in V_E$, we have

\[ \|(A - \lambda_0 I)\psi\| \leq \varepsilon \|\psi\|. \]

3. The spectrum of $A|_{V_E}$ is contained in the closure of $E$.

4. If $\lambda_0$ is in the spectrum of $A$, then for every neighborhood $U$ of $\lambda_0$,
we have $V_U \neq \{0\}$, or, equivalently, $\mu^A(U) \neq 0$.

Proof. For Point 1, observe that for any bounded measurable functions $f$
and $g$ on $\sigma(A)$, the operators $f(A)$ and $g(A)$ commute, since the product
in either order is equal to the integral of the function $fg = gf$ with respect
to $\mu^A$. In particular, $A$, which is the integral of the function $f(\lambda) = \lambda$,
commutes with $\mu^A(E)$, which is the integral of the function $1_E$. Thus,
given a vector $\mu^A(E)\phi$ in the range of $\mu^A(E)$, we have

\[ A\mu^A(E)\phi = \mu^A(E)A\phi, \]

which is again in the range of $\mu^A(E)$, establishing the invariance of the
spectral subspace.

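For Point 2, suppose that $\psi \in V_E$, where $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$. Then $\psi$ is
in the range of $\mu^A(E)$, and so

\[ (A - \lambda_0 I)\psi = (A - \lambda_0 I)\mu^A(E)\psi. \]

But $\mu^A(E) = 1_E(A)$ and $A - \lambda_0 I = f(A)$, where $f(\lambda) = \lambda - \lambda_0$. By the
multiplicativity of the integral, then,

\[ (A - \lambda_0 I)\psi = (f 1_E)(A)\psi. \]

But $|f(\lambda) 1_E(\lambda)| \leq \varepsilon$ and so by (7.16), the operator $(f 1_E)(A)$ has norm at
most $\varepsilon$.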

For Point 3, if $\lambda_0$ is not in the closure $\bar{E}$ of $E$, then the function
$g(\lambda) := 1_E(\lambda)(1/(\lambda - \lambda_0))$ is bounded. Thus, $g(A)$ is a bounded operator and

\[ g(A)(A - \lambda_0 I) = (A - \lambda_0 I)g(A) = 1_E(A). \]

This shows that the restriction to $V_E$ of $g(A)$ is the inverse of the restriction
to $V_E$ of $A - \lambda_0 I$. Thus, $\lambda_0$ is not in the spectrum of $A|_{V_E}$.

For Point 4, fix $\lambda_0 \in \sigma(A)$ and suppose for some $\varepsilon > 0$, we have
$\mu^A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = 0$. Consider, then, the bounded function $f$ defined by

\[ f(\lambda) = \begin{cases} \dfrac{1}{\lambda - \lambda_0}, & |\lambda - \lambda_0| \geq \varepsilon, \\ 0, & |\lambda - \lambda_0| < \varepsilon. \end{cases} \]

Since $f(\lambda) \cdot (\lambda - \lambda_0)$ equals 1 except on $(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, the equation
$f(\lambda) \cdot (\lambda - \lambda_0) = 1$ holds $\mu^A$-almost everywhere. Thus, the integral of this
function coincides with the integral of the constant function 1, which is $I$.
Since the integral is multiplicative, we see that

\[ f(A)(A - \lambda_0 I) = (A - \lambda_0 I)f(A) = I, \]

showing that the bounded operator $f(A)$ is the inverse of $A - \lambda_0 I$. This
contradicts the assumption that $\lambda_0 \in \sigma(A)$. ■

Proposition 7.16 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $B \in \mathcal{B}(\mathbf{H})$ commutes
with $A$, the following results hold.

1. For all bounded measurable functions $f$ on $\sigma(A)$, the operator $f(A)$
commutes with $B$.

2. Each spectral subspace for $A$ is invariant under $B$.

The proof of this proposition is deferred until Chap. 8. We conclude this
section by fulfilling (at least for bounded self-adjoint operators) one of
the goals of the spectral theorem, namely to give a probability measure
describing the probabilities for measurements of a self-adjoint operator $A$
in the state $\psi$.

Proposition 7.17 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi \in \mathbf{H}$ is a unit
vector. Then there exists a unique probability measure $\mu^A_\psi$ on $\mathbb{R}$ such that

\[ \int_{\mathbb{R}} \lambda^m \, d\mu^A_\psi(\lambda) = \langle \psi, A^m \psi \rangle \]

for all non-negative integers $m$.

We will prove a version of Proposition 7.17 for unbounded self-adjoint
operators in Chap. 9. In the unbounded case, however, we will not obtain
uniqueness of the probability measure, even if $\psi$ is in the domain of $A^m$ for
all $m$. Even in the unbounded case, however, the spectral theorem provides
a canonical choice of the probability measure.

Proof. We define a measure $\mu^A_\psi$ on $\sigma(A)$ as in Sect. 7.2.2 by

\[ \mu^A_\psi(E) = \langle \psi, \mu^A(E)\psi \rangle. \]

The properties of integration with respect to $\mu^A$ then tell us that

\[ \langle \psi, A^m \psi \rangle = \left\langle \psi, \left( \int_{\sigma(A)} \lambda^m \, d\mu^A(\lambda) \right) \psi \right\rangle = \int_{\sigma(A)} \lambda^m \, d\mu^A_\psi(\lambda). \]

We then extend $\mu^A_\psi$ to $\mathbb{R}$ by setting it equal to zero on $\mathbb{R} \setminus \sigma(A)$, establishing
the existence of the desired probability measure on $\mathbb{R}$. Since

\[ |\langle \psi, A^m \psi \rangle| \leq \|\psi\|^2 \|A^m\| \leq \|\psi\|^2 \|A\|^m, \]

the moments grow only exponentially with $m$. Thus, standard uniqueness
results for the moment problem (e.g., Theorem 8.1 in Chap. 4 of [18]) give
the uniqueness of $\mu^A_\psi$. ■
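
For a matrix, the measure $\mu^A_\psi$ is a finite sum of point masses at the eigenvalues, with weights $\langle \psi, P_j \psi \rangle$. The following sketch, again for the hypothetical matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ and the unit vector $\psi = (1, 0)$, compares the moments of this measure with $\langle \psi, A^m \psi \rangle$. All names are illustrative assumptions.

```python
from fractions import Fraction

A = [[2, 1], [1, 2]]
half = Fraction(1, 2)

# Spectral projections of A onto the eigenspaces for eigenvalues 1 and 3.
PROJECTIONS = {
    1: [[half, -half], [-half, half]],
    3: [[half, half], [half, half]],
}
psi = [1, 0]  # a real unit vector, so no complex conjugation is needed below

def apply(M, v):
    """Apply a 2x2 matrix to a vector."""
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

# mu_psi({lambda}) = <psi, P_lambda psi>: the probability of measuring lambda.
mu_psi = {lam: inner(psi, apply(P, psi)) for lam, P in PROJECTIONS.items()}

def moment_from_measure(m):
    """Integral of lambda^m against the measure mu_psi."""
    return sum(lam ** m * weight for lam, weight in mu_psi.items())

def moment_from_operator(m):
    """The expectation value <psi, A^m psi>."""
    v = psi
    for _ in range(m):
        v = apply(A, v)
    return inner(psi, v)
```

With these choices each eigenvalue carries probability $1/2$, so the $m$th moment is $(1 + 3^m)/2$; for example, the mean is 2 and the second moment is 5.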

7.3  Spectral  Theorem  for  Bounded  Self-Adjoint 
Operators,  II 

As we have already noted in Sect. 6.5, one version of the spectral theorem
asserts that every self-adjoint operator is unitarily equivalent to a
multiplication operator. In the case of a bounded self-adjoint operator $A$ on a
separable Hilbert space $\mathbf{H}$, this result means that $A$ is unitarily equivalent
to the operator $M_h$ on $L^2(X, \mu)$, where $(X, \mu)$ is a $\sigma$-finite measure
space, $h$ is a measurable, real-valued function, and $M_h$ is the operator of
multiplication by $h$:

\[ (M_h \psi)(\lambda) = h(\lambda)\psi(\lambda). \]

Although the “multiplication operator” form of the spectral theorem
(Theorem 7.20) has the advantage of being easy to state, there is an even
better version involving the concept of a direct integral. It is straightforward
to extend the notion of an $L^2$ space to an $L^2$ space with values in a Hilbert
space $\mathbf{H}$. In a direct integral, we extend the concept one step further, by
allowing the Hilbert space to depend on the point. We begin with a measure
space $(X, \mu)$ and then have one Hilbert space $\mathbf{H}_\lambda$ for each $\lambda$ in $X$. An
element of the direct integral is a function $s$ on $X$ such that $s(\lambda)$ belongs
to $\mathbf{H}_\lambda$ for each $\lambda \in X$. Given a real-valued measurable function $h$ on $X$, it
makes sense to multiply an element $s$ of the direct integral by $h$.

The direct integral form of the spectral theorem says a bounded self-adjoint
operator $A$ is unitarily equivalent to a multiplication operator on a
direct integral. By extending multiplication operators to the more general
setting of direct integrals (instead of just ordinary $L^2$ spaces), we gain several
benefits. First, the set $X$ and the function $h$ become canonical: The
set $X$ is simply the spectrum of $A$ and the function $h$ is simply $h(\lambda) = \lambda$.
Second, the direct integral approach carries with it a notion of “generalized
eigenvectors,” since the space $\mathbf{H}_\lambda$ can be thought of as the space of generalized
eigenvectors with eigenvalue $\lambda$. (The spaces $\mathbf{H}_\lambda$ are not, in general,
contained in the direct integral Hilbert space. Thus, direct integrals give a
rigorous meaning to the idea of “eigenvectors” that are not in the Hilbert
space on which the operator acts.) Third, the direct integral approach gives
a simple way to classify self-adjoint operators up to unitary equivalence:
Two self-adjoint operators are unitarily equivalent if and only if their direct
integral representations are equivalent in a natural sense (Proposition 7.24).

If one really wants the simplicity of the (ordinary) multiplication operator
version of the spectral theorem, it is a simple matter to prove this result
using precisely the same methods as in the proof of the direct integral
version. (See Theorem 7.20.) Nevertheless, the direct integral version is,
arguably, the most definitive version of the spectral theorem for a single
self-adjoint operator.

We turn now to the definition of a direct integral. Suppose $\mu$ is a $\sigma$-finite
measure on a $\sigma$-algebra $\Omega$ of sets in $X$. Suppose also that for each $\lambda \in X$,
we have a separable Hilbert space $\mathbf{H}_\lambda$ with inner product $\langle \cdot, \cdot \rangle_\lambda$. We want
to define the direct integral of the $\mathbf{H}_\lambda$’s with respect to $\mu$. Elements of the
direct integral will be sections $s$, meaning that $s$ is a function on $X$ with
values in the union of the $\mathbf{H}_\lambda$’s, having the property that

\[ s(\lambda) \in \mathbf{H}_\lambda \]

for each $\lambda$ in $X$. We would like to define the norm of a section $s$ by the
formula

\[ \|s\|^2 = \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda), \]

provided that the integral on the right-hand side is finite. The inner product
of two sections $s_1$ and $s_2$ (with finite norm) should then be given by the
formula

\[ \langle s_1, s_2 \rangle = \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]


The problem with this description of the norm and inner product on
the direct integral is that we have not said anything about measurability.
As things stand, it does not make sense to ask whether a section $s$ is
measurable, since the space in which $s(\lambda)$ takes its values is different for
each $\lambda$. We must, therefore, introduce some additional structure that gives
rise to a notion of measurability. (The measurability issue is a technicality
that can be ignored on a first reading.)

One way to address the measurability issue is to choose a simultaneous
orthonormal basis for each of the Hilbert spaces $\mathbf{H}_\lambda$. To deal with the
possibility that different spaces can have different dimensions, we slightly
modify the concept of an orthonormal basis. We say that a family $\{e_j\}$ of
vectors is an orthonormal basis for a Hilbert space $\mathbf{H}$ if $\langle e_j, e_k \rangle = 0$ for
$j \neq k$, the norm of each $e_j$ is either 0 or 1, and the closure of the span
of the $e_j$’s is all of $\mathbf{H}$. This just means that we allow some of the vectors
in our basis to be zero, with the nonzero vectors forming an orthonormal
basis in the usual sense.

We now define a simultaneous orthonormal basis for a family $\{\mathbf{H}_\lambda\}$ of
separable Hilbert spaces to be a collection $\{e_j(\cdot)\}_{j=1}^{\infty}$ of sections with the
property that for each $\lambda$, $\{e_j(\lambda)\}_{j=1}^{\infty}$ is an orthonormal basis for $\mathbf{H}_\lambda$.
Provided that the function $\lambda \mapsto \dim \mathbf{H}_\lambda$ is a measurable function from $X$ into
$[0, \infty]$, it is possible to choose a simultaneous orthonormal basis $\{e_j(\cdot)\}_{j=1}^{\infty}$
such that $\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$ is measurable for all $j$ and $k$. Having chosen a
simultaneous orthonormal basis with this property, we define a section $s$ to
be measurable if the function

\[ \lambda \mapsto \langle e_j(\lambda), s(\lambda) \rangle_\lambda \]

is a measurable complex-valued function for each $j$. Our assumption on the
$e_j$’s means that the $e_j$’s themselves are measurable sections.

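We refer to a choice of simultaneous orthonormal basis, chosen so that
$\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$ is measurable, as a measurability structure on the collection
of $\mathbf{H}_\lambda$’s. Given two measurable sections $s_1$ and $s_2$, the function

\[ \lambda \mapsto \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda = \sum_{j=1}^{\infty} \langle s_1(\lambda), e_j(\lambda) \rangle_\lambda \langle e_j(\lambda), s_2(\lambda) \rangle_\lambda \]

is also measurable.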

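Definition 7.18 Suppose the following structures are given: (1) a $\sigma$-finite
measure space $(X, \Omega, \mu)$, (2) a collection $\{\mathbf{H}_\lambda\}_{\lambda \in X}$ of separable Hilbert
spaces for which the dimension function is measurable, and (3) a measurability
structure on $\{\mathbf{H}_\lambda\}_{\lambda \in X}$. Then the direct integral of the $\mathbf{H}_\lambda$’s
with respect to $\mu$, denoted

\[ \int_X^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

is the space of equivalence classes of almost-everywhere-equal measurable
sections $s$ for which

\[ \|s\|^2 := \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda) < \infty. \]

The inner product $\langle s_1, s_2 \rangle$ of two such sections $s_1$ and $s_2$ is given by the
formula

\[ \langle s_1, s_2 \rangle := \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]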

To see that the integral defining the inner product of two finite-norm
sections is finite, note that $|\langle s_1(\lambda), s_2(\lambda) \rangle_\lambda| \leq \|s_1(\lambda)\|_\lambda \|s_2(\lambda)\|_\lambda$. By
assumption, $\|s_j(\lambda)\|_\lambda$ is a square-integrable function of $\lambda$ for $j = 1, 2$, and
the product of two square-integrable functions is integrable. Thus, the
integrand in the definition of $\langle s_1, s_2 \rangle$ is also integrable. It is not hard to show,
using an argument similar to the proof of completeness of $L^2$ spaces, that
a direct integral of Hilbert spaces is a Hilbert space.

Let us think of two important special cases of the direct integral
construction. First, if each of the $\mathbf{H}_\lambda$’s is simply $\mathbb{C}$, then the direct integral
(with the obvious measurability structure) is simply $L^2(X, \mu)$. Second,
suppose that $X = \{\lambda_1, \lambda_2, \ldots\}$ is countable, $\Omega$ is the $\sigma$-algebra of all subsets
of $X$, and $\mu$ is the counting measure on $X$. Then the direct integral is the
Hilbert space direct sum (Definition A.45).

Given a direct integral, suppose we have some $\lambda_0 \in X$ for which $\{\lambda_0\}$
is measurable and such that $c := \mu(\{\lambda_0\}) > 0$. Then we can embed $\mathbf{H}_{\lambda_0}$
isometrically into the direct integral by mapping each $\psi \in \mathbf{H}_{\lambda_0}$ to the
section $s$ given by

\[ s(\lambda) = \begin{cases} \dfrac{1}{\sqrt{c}} \psi, & \lambda = \lambda_0, \\ 0, & \lambda \neq \lambda_0. \end{cases} \]

Even if $\mu(\{\lambda_0\}) = 0$, we may still think that $\mathbf{H}_{\lambda_0}$ is a sort of “generalized
subspace” of the direct integral.
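
The counting-measure case can be modeled very directly. In the sketch below, a section over a hypothetical finite set $X = \{1, 2, 5\}$ is a dictionary assigning to each point a vector in a fiber whose dimension varies with the point (the fiber over 5 is zero-dimensional). The inner product is the sum of the fiberwise inner products, taken conjugate-linear in the first factor, and multiplication by $h(\lambda) = \lambda$ acts fiber by fiber. All names and data here are illustrative assumptions.

```python
# Fibers of dimension 1, 2, and 0 over the points 1, 2, and 5.
s1 = {1: [1], 2: [1j, 2], 5: []}
s2 = {1: [2], 2: [1, 1j], 5: []}

def inner(s, t):
    """Direct-sum inner product: sum over points of the fiberwise inner products."""
    return sum(a.conjugate() * b
               for lam in s
               for a, b in zip(s[lam], t[lam]))

def multiply_by_lambda(s):
    """The multiplication operator (M s)(lambda) = lambda * s(lambda)."""
    return {lam: [lam * x for x in vec] for lam, vec in s.items()}

norm_sq = inner(s1, s1)
left = inner(s1, multiply_by_lambda(s2))
right = inner(multiply_by_lambda(s1), s2)
```

Because the function $\lambda \mapsto \lambda$ is real-valued, `left` and `right` agree, reflecting the self-adjointness of the multiplication operator. Since every point has counting measure 1, each fiber embeds isometrically as the sections supported at that point.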


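Theorem 7.19 (Spectral Theorem, Second Form) If $A \in \mathcal{B}(\mathbf{H})$ is
self-adjoint, then there exists a $\sigma$-finite measure $\mu$ on $\sigma(A)$, a direct
integral

\[ \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

and a unitary map $U$ between $\mathbf{H}$ and the direct integral such that

\[ \left[ U A U^{-1}(s) \right](\lambda) = \lambda s(\lambda) \tag{7.20} \]

for all sections $s$ in the direct integral.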


The  proof  of  Theorem  7.19  is  given  in  the  next  chapter,  along  with  the 
proof  of  our  first  version  of  the  spectral  theorem.  In  the  meantime,  let  us 
think  about  what  this  version  of  the  spectral  theorem  is  saying.  We  may 
think  that  the  unitary  map  U  is  an  identification  of  our  original  Hilbert 
space  H  with  a  certain  direct  integral  over  the  spectrum  of  A.  Under  this 
identification, the self-adjoint operator $A$ becomes the operator of multiplication
by $\lambda$, that is, the map sending the section $s(\lambda)$ to $\lambda s(\lambda)$. Roughly
speaking, then, the operator $A$ acts (under our identification) as $\lambda I$ on
each space $\mathbf{H}_\lambda$. Thus, we may think of $\mathbf{H}_\lambda$ as being something like an
“eigenspace” for $A$, for each element $\lambda$ of the spectrum of $A$. Of course,
unless $\mu(\{\lambda\}) > 0$, the Hilbert space $\mathbf{H}_\lambda$ is not actually contained in $\mathbf{H}$.
Nevertheless, we may think of elements of a given $\mathbf{H}_\lambda$ as “generalized
eigenvectors” for the operator $A$.
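
In finite dimensions this picture can be checked by hand. The following is a minimal Python sketch, not part of the text's argument. It assumes the illustrative matrix A = [[2, 1], [1, 2]] acting on real vectors, takes the measure to be counting measure on the spectrum {1, 3}, and realizes each space H_lambda as the corresponding eigenspace. The names `P`, `apply`, and `section` are hypothetical choices made for this example.

```python
# Finite-dimensional sketch of Theorem 7.19 with counting measure on sigma(A).
# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]

# P[lam] is the orthogonal projection onto the eigenspace H_lam.
P = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],  # onto span{(1, -1)}
    3.0: [[0.5, 0.5], [0.5, 0.5]],    # onto span{(1, 1)}
}


def apply(M, v):
    """Multiply a 2x2 matrix by a vector of length 2."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def section(psi):
    """U(psi): the section lam -> component of psi in the eigenspace H_lam."""
    return {lam: apply(P_lam, psi) for lam, P_lam in P.items()}


psi = [1.0, 0.0]
s = section(psi)
s_of_A_psi = section(apply(A, psi))

# Analogue of (7.20): under U, the operator A multiplies s(lam) by lam.
for lam in P:
    assert s_of_A_psi[lam] == [lam * x for x in s[lam]]

# U is isometric: the squared norm of the section equals ||psi||^2.
norm_sq = sum(x * x for vec in s.values() for x in vec)
assert norm_sq == 1.0
```

Here every point of the spectrum has positive measure, so each H_lambda really is a subspace of the original space; the "generalized" case arises only when a point has measure zero.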

The  direct  integral  formulation  of  the  spectral  theorem  leads  readily  to  a 
classification  result  for  bounded  self-adjoint  operators.  See  Proposition  7.24 
later  in  this  section.  Meanwhile,  as  we  noted  earlier  in  this  section,  the 
method  of  proof  for  Theorem  7.19  also  yields  a  version  of  the  spectral 
theorem involving multiplication operators on ordinary $L^2$ spaces.

Theorem 7.20 (Spectral Theorem, Multiplication Operator Form)
Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then there exists a $\sigma$-finite measure
space $(X, \mu)$, a bounded, measurable, real-valued function $h$ on $X$, and a
unitary map $U : \mathbf{H} \to L^2(X, \mu)$ such that

$$\left[UAU^{-1}(\psi)\right](\lambda) = h(\lambda)\psi(\lambda)$$

for all $\psi \in L^2(X, \mu)$.

We return now to a discussion of the direct integral version of the spectral
theorem. This version gives a simple description of the functional calculus.

Proposition 7.21 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $U$ is a unitary
map as in Theorem 7.19. Then for any bounded measurable function $f$ on
$\sigma(A)$, we have

$$\left[U f(A) U^{-1}(s)\right](\lambda) = f(\lambda) s(\lambda).$$

Thus, roughly speaking, $f(A)$ is defined to be $f(\lambda) I$ on each “generalized
eigenspace” $\mathbf{H}_\lambda$. Proposition 7.21 follows directly from (7.20) if $f$ is a
polynomial; the result for continuous $f$ then follows by taking uniform limits.
The result for general $f$ is then easily established by using the limiting
arguments of Chap. 8, especially Exercise 3.

Let us now consider what sort of uniqueness there should be in the second
version of the spectral theorem. There is a “trivial” source of nonuniqueness
coming from the possibility that some of the $\mathbf{H}_\lambda$’s may have dimension 0.
Let $E_0$ denote the set of $\lambda$ for which $\dim \mathbf{H}_\lambda = 0$. Even if $\mu(E_0) > 0$, the set
$E_0$ makes no contribution to the norm of a section, since every section is
automatically zero on $E_0$. Thus, we may define a new measure $\tilde{\mu}$ by setting
$\tilde{\mu}(E) = \mu(E \cap E_0^c)$, so that $\tilde{\mu}$ agrees with $\mu$ on $E_0^c$ but is zero on $E_0$. Then
the direct integrals of the $\mathbf{H}_\lambda$’s with respect to $\mu$ and with respect to $\tilde{\mu}$ are
“indistinguishable.” Thus, we can always modify a direct integral so as to
assume that $\dim \mathbf{H}_\lambda > 0$ for almost every $\lambda$.

Meanwhile, unlike the projection-valued measure $\mu^A$ in Theorem 7.12,
the measure $\mu$ in Theorem 7.19 is not unique, but only unique up to
equivalence, where two $\sigma$-finite measures on a given measurable space are
equivalent if they have precisely the same sets of measure zero. For a given measure
$\mu$, the Hilbert spaces $\mathbf{H}_\lambda$ are unique only up to unitary equivalence,
meaning that only the dimension of the spaces is uniquely determined. Even
the dimension of $\mathbf{H}_\lambda$ is uniquely determined only up to a set of $\mu$-measure
zero. As it turns out, the sources of nonuniqueness in this paragraph and
the previous paragraph are all that exist.

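Proposition 7.22 (Uniqueness in Theorem 7.19) Suppose $A \in \mathcal{B}(\mathbf{H})$
is self-adjoint and consider two different direct integrals as in Theorem 7.19,
one with measure $\mu^{(1)}$ and Hilbert spaces $\mathbf{H}_\lambda^{(1)}$ and the other with
measure $\mu^{(2)}$ and Hilbert spaces $\mathbf{H}_\lambda^{(2)}$. If $\dim \mathbf{H}_\lambda^{(j)} > 0$ for $\mu^{(j)}$-almost every $\lambda$
($j = 1, 2$), then $\mu^{(1)}$ and $\mu^{(2)}$ are mutually absolutely continuous and

$$\dim \mathbf{H}_\lambda^{(1)} = \dim \mathbf{H}_\lambda^{(2)}$$

for $\mu^{(j)}$-almost every $\lambda$ ($j = 1, 2$).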

See  the  end  of  the  next  chapter  for  a  sketch  of  the  proof  of  this  uniqueness 
result. 

Theorem 7.19 should be thought of as a refinement of our earlier form
(Theorem 7.12) of the spectral theorem, in the sense that we can easily
recover Theorem 7.12 from Theorem 7.19. In the setting of Theorem 7.19,
and given a measurable set $E \subset \sigma(A)$, let $V_E$ denote the space of
(equivalence classes of) sections $s$ that are supported on $E$, that is, for which
$s(\lambda) = 0$ for $\mu$-almost every $\lambda$ in $E^c$. This is easily seen to be a closed
subspace. Let $P_E$ denote the orthogonal projection onto $V_E$, and define

$$\mu^A(E) = U^{-1} P_E U. \tag{7.21}$$

It is straightforward to check that $\mu^A$ is a projection-valued measure on
$\sigma(A)$, with values in $\mathcal{B}(\mathbf{H})$, and that $\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$.
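
For a matrix, the construction in (7.21) reduces to summing eigenprojections, and the two claims just made can be verified directly. The sketch below is only an illustration under stated assumptions: it uses the hypothetical example A = [[2, 1], [1, 2]] with counting measure on its spectrum {1, 3}, and the helper names (`mu_A`, `add`, `scale`, `matmul`) are invented for this example.

```python
# Assumed example: A = [[2, 1], [1, 2]] with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
EIGENPROJECTIONS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],  # projection onto span{(1, -1)}
    3.0: [[0.5, 0.5], [0.5, 0.5]],    # projection onto span{(1, 1)}
}
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def add(M, N):
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]


def scale(c, M):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mu_A(E):
    """mu^A(E): the sum of the eigenprojections whose eigenvalue lies in E."""
    total = ZERO
    for lam, P in EIGENPROJECTIONS.items():
        if lam in E:
            total = add(total, P)
    return total


# The integral of lam against mu^A is a finite sum over the spectrum.
integral = ZERO
for lam in EIGENPROJECTIONS:
    integral = add(integral, scale(lam, mu_A({lam})))
assert integral == A

# Multiplicativity: mu^A(E1 intersect E2) = mu^A(E1) mu^A(E2).
E1, E2 = {1.0}, {1.0, 3.0}
assert mu_A(E1 & E2) == matmul(mu_A(E1), mu_A(E2))
assert mu_A({1.0} & {3.0}) == matmul(mu_A({1.0}), mu_A({3.0})) == ZERO
```

The entries were chosen so that all arithmetic is exact in floating point, which is why the checks use equality rather than a tolerance.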

Note that both versions of the spectral theorem for $A$ involve a measure,
the first, denoted $\mu^A$, being a projection-valued measure, and the second,
denoted $\mu$, being an ordinary measure with values in the non-negative real
numbers. The following result shows the relationship between the two
measures.

Proposition 7.23 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, $\mu^A$ is the
projection-valued measure given by Theorem 7.12 and $\mu$ is a real-valued measure as
in Theorem 7.19. If $\dim \mathbf{H}_\lambda > 0$ for $\mu$-almost every $\lambda$, then for any Borel
set $E \subset \sigma(A)$, $\mu^A(E) = 0$ if and only if $\mu(E) = 0$.

Of course, the 0 in the expression $\mu^A(E) = 0$ is the zero operator, whereas
the 0 in the expression $\mu(E) = 0$ is the number 0. Nevertheless, we may
think of Proposition 7.23 as saying that $\mu^A$ and $\mu$ are equivalent in the
usual measure-theoretic sense, having precisely the same sets of measure
zero.

Proof. As we have remarked, given a direct integral as in Theorem 7.19,
we can construct a projection-valued measure by means of (7.21), and this
projection-valued measure satisfies $\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$. This
projection-valued measure must coincide with the one in Theorem 7.12, by the
uniqueness in that theorem.

Now, if $\mu(E) = 0$, then any section supported on $E$ is zero almost
everywhere and thus represents the zero element of the direct integral. In that
case, $V_E = \{0\}$ and so $\mu^A(E) = 0$ by (7.21). In the other direction, suppose
$\mu(E) > 0$. Since $\mu$ is $\sigma$-finite, $E$ will contain a measurable subset $F$ such
that $0 < \mu(F) < \infty$. Then let $s$ be the section given by

$$s(\lambda) = \sum_{j=1}^{\infty} \frac{1}{2^j} e_j(\lambda)$$

for $\lambda \in F$ and $s(\lambda) = 0$ for $\lambda \in F^c$, where $\{e_j(\cdot)\}$ is our measurability
structure for the direct integral. Then

$$\langle s(\lambda), e_j(\lambda) \rangle_\lambda = \frac{1}{2^j} \langle e_j(\lambda), e_j(\lambda) \rangle_\lambda \, 1_F(\lambda),$$

which is a measurable function of $\lambda$ for all $j$, so that $s$ is measurable. Since
we assume that $\mathbf{H}_\lambda$ has nonzero dimension for $\mu$-almost every $\lambda$, $s$ will be
nonzero almost everywhere on $F$ and thus will have positive norm. The
norm of $s$ is finite because $\|s(\lambda)\| < 1$ and $F$ has finite measure. Thus,
$V_E \neq \{0\}$ and $\mu^A(E) \neq 0$. ■

We say that self-adjoint operators $A_1$ and $A_2$ on Hilbert spaces $\mathbf{H}_1$ and
$\mathbf{H}_2$ are unitarily equivalent if there exists a unitary map $U : \mathbf{H}_1 \to \mathbf{H}_2$
such that

$$A_2 = U A_1 U^{-1}.$$

Using Proposition 7.22, we can give a classification of bounded self-adjoint
operators on separable Hilbert spaces up to unitary equivalence. For a given
bounded self-adjoint operator $A$, we call the function $\lambda \mapsto \dim \mathbf{H}_\lambda$ the
multiplicity function for $A$. It is well defined (independent of the choice of
direct integral decomposition) up to a set of measure zero. It turns out that
bounded self-adjoint operators are characterized, up to unitary equivalence,
by the spectrum of $A$ as a set, the equivalence class of the measure $\mu$ in
Theorem 7.19, and the multiplicity function.

Proposition 7.24 Suppose $A_1$ and $A_2$ are bounded self-adjoint operators
on separable Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. Choose direct integral
representations for $A_1$ and $A_2$ as in Theorem 7.19, with the associated
measures $\mu_1$ and $\mu_2$ chosen so that $\dim \mathbf{H}_\lambda^{(j)} > 0$ for $\mu_j$-almost every $\lambda$
($j = 1, 2$). Then $A_1$ and $A_2$ are unitarily equivalent if and only if the
following conditions are satisfied.

1. $\sigma(A_1) = \sigma(A_2)$.

2. The measures $\mu_1$ and $\mu_2$ are mutually absolutely continuous.

3. The multiplicity functions of $A_1$ and $A_2$ coincide up to a set of
measure zero.

See Exercise 12 for a proof of this result.
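
For diagonal matrices the classification can be made concrete: the multiplicity function simply counts how often each eigenvalue occurs on the diagonal. The following Python sketch is an illustration under simplifying assumptions, not a proof. Both measures are taken to be counting measure on the spectrum, so Condition 2 holds automatically, and the function names are hypothetical.

```python
from collections import Counter


def multiplicity_function(diagonal):
    """Map each eigenvalue lam of diag(diagonal) to dim H_lam."""
    return Counter(diagonal)


def unitarily_equivalent(diag1, diag2):
    """Check Conditions 1 and 3 of Proposition 7.24 for two diagonal matrices.

    Condition 2 is automatic here because, as an assumption of this sketch,
    both measures are counting measure on the (finite) spectrum.
    """
    m1, m2 = multiplicity_function(diag1), multiplicity_function(diag2)
    same_spectrum = set(m1) == set(m2)
    same_multiplicities = all(m1[lam] == m2[lam] for lam in m1)
    return same_spectrum and same_multiplicities


# Same spectrum {1, 2}, but the multiplicities differ: not equivalent.
assert not unitarily_equivalent([1, 1, 2], [1, 2, 2])

# Reordering the diagonal is implemented by a (unitary) permutation matrix.
assert unitarily_equivalent([1, 1, 2], [2, 1, 1])
```

The first pair shows that the spectrum alone does not determine the operator up to unitary equivalence; the multiplicity function carries the remaining information.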


7.4  Exercises 

1. Suppose $A$ and $B$ are commuting linear operators on a nonzero
finite-dimensional vector space.

(a) Show that each eigenspace for $A$ is invariant under $B$.

(b) Show that $A$ and $B$ have at least one simultaneous eigenvector,
that is, a nonzero vector $v$ with $Av = \lambda v$ and $Bv = \mu v$, for some
constants $\lambda, \mu \in \mathbb{C}$.

2. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is normal, meaning that $AA^* = A^*A$.
Suppose that for some $\psi \in \mathbf{H}$ and $\lambda \in \mathbb{C}$ we have $A\psi = \lambda\psi$. Show that
$A^*\psi = \bar{\lambda}\psi$.

Hint: Compute $\|(A^* - \bar{\lambda} I)\psi\|$.

3. Suppose a closed subspace $V$ of $\mathbf{H}$ is invariant under a bounded
operator $A$, meaning that $A\psi \in V$ for all $\psi \in V$. Show that the orthogonal
complement $V^\perp$ of $V$ is invariant under $A^*$.

4. (a) Suppose that $\mathbf{H}$ is a finite-dimensional Hilbert space over $\mathbb{C}$ and
$A$ is a normal linear operator on $\mathbf{H}$ in the sense of Exercise 2.
Show that there exists an orthonormal basis for $\mathbf{H}$ consisting of
simultaneous eigenvectors for $A$ and $A^*$.

Hint: Use Exercises 1 and 3.

(b) Suppose $A$ is a linear operator on a finite-dimensional Hilbert
space $\mathbf{H}$ over $\mathbb{C}$ and suppose there exists an orthonormal basis
for $\mathbf{H}$ consisting of eigenvectors of $A$. Show that $A$ commutes
with $A^*$.


5. Suppose $A \in \mathcal{B}(\mathbf{H})$ has an inverse $A^{-1}$ in $\mathcal{B}(\mathbf{H})$. Show that
$(A^{-1})^* A^* = A^* (A^{-1})^* = I$. Conclude that $A^*$ is invertible and $(A^*)^{-1} = (A^{-1})^*$.

6. Suppose $U$ is a unitary operator on $\mathbf{H}$ (Definition A.55). Show that
the spectrum of $U$ is contained in the unit circle.

Hint: By writing $U - \lambda I$ as $(-\lambda)(I - U/\lambda)$ or as $U(I - \lambda U^{-1})$, show
that any $\lambda$ with $|\lambda| \neq 1$ is in the resolvent set of $U$.

7. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and non-negative, that is, that
$A$ satisfies (7.3). Show that the spectrum of $A$ is contained in the
interval $[0, \infty)$.

Note: Conversely, if $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\sigma(A) \subset [0, \infty)$, then
$A$ is non-negative. See Exercise 2 in Chap. 8.


8. Suppose $A \in \mathcal{B}(\mathbf{H})$ is invertible. Show that there exists $\varepsilon > 0$ such
that for all $B \in \mathcal{B}(\mathbf{H})$ with $\|B - A\| < \varepsilon$, $B$ is also invertible.

Hint: Use a power series argument as in the proof of Proposition 7.5.

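9. Assume $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint.

(a) Suppose $\lambda_0 \in \mathbb{C}$ is a point in the resolvent set of $A$. Show that

$$\left\| (A - \lambda_0 I)^{-1} \right\| = \frac{1}{d(\lambda_0, \sigma(A))},$$

where $d(\lambda_0, \sigma(A)) = \inf_{\lambda \in \sigma(A)} |\lambda - \lambda_0|$.

Hint: Think of $(A - \lambda_0 I)^{-1}$ as a function of $A$ in the sense of
the functional calculus for $A$.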

(b) Given $\lambda_0 \in \mathbb{C}$, suppose that there exists some nonzero $\psi \in \mathbf{H}$
such that

$$\|A\psi - \lambda_0\psi\| < \varepsilon \|\psi\|.$$

Show that there exists $\lambda \in \sigma(A)$ such that $|\lambda - \lambda_0| < \varepsilon$.

10. Suppose $V_1$ and $V_2$ are two closed subspaces of $\mathbf{H}$, with associated
orthogonal projections $P_1$ and $P_2$. Show that $V_1$ and $V_2$ are orthogonal
if and only if $P_1 P_2 = 0$.

11. Suppose $\mu$ is a projection-valued measure on $(X, \Omega)$. Show that for
any $E_1, E_2 \in \Omega$, $\mu(E_1)\mu(E_2)$ is the projection onto the closed
subspace $\operatorname{Range}(\mu(E_1)) \cap \operatorname{Range}(\mu(E_2))$.

Hint: Write $E_1$ as $E_1 = (E_1 \cap E_2) \cup (E_1 \setminus E_2)$ and use Exercise 10.

12. Prove Proposition 7.24.

Hint: Use Proposition 7.22 and the Radon-Nikodym theorem
(Theorem A.6).


The Spectral Theorem for Bounded
Self-Adjoint Operators: Proofs


In  this  chapter  we  give  proofs  of  all  versions  of  the  spectral  theorem  stated 
in  the  previous  chapter. 


8.1  Proof  of  the  Spectral  Theorem,  First  Version 

A proof of the spectral theorem, in its projection-valued measure form, can
be obtained in two main stages. The first stage of the proof is to define a
continuous functional calculus, meaning we associate with each continuous
function $f$ on $\sigma(A)$ an operator $f(A)$. The map $f \mapsto f(A)$ should have the
property that if $f$ is the function $f(\lambda) = \lambda^m$, then $f(A) = A^m$. The
continuous functional calculus is then constructed by approximating continuous
functions on $\sigma(A)$ by polynomials. The Stone-Weierstrass theorem tells us
that polynomials are dense in the continuous functions on $\sigma(A)$; it remains
only to show that if a sequence $p_n$ of polynomials converges uniformly to
some continuous function $f$ on $\sigma(A)$, then the operators $p_n(A)$ converge to
some operator, which we will then call $f(A)$.

The second stage of the proof is to show that the continuous functional
calculus can be represented as integration against a projection-valued
measure. This result is just an operator-valued version of the Riesz
representation theorem from measure theory (Theorem 8.5). Indeed, we will see
that this operator-valued version of the Riesz representation theorem can
be reduced to the usual form of the theorem.


8.1.1 Stage 1: The Continuous Functional Calculus

We begin by defining, for any $A \in \mathcal{B}(\mathbf{H})$, the spectral radius $R(A)$ by

$$R(A) = \sup_{\lambda \in \sigma(A)} |\lambda|.$$

(By Propositions 7.5 and 7.7, $\sigma(A)$ is a nonempty, bounded subset of $\mathbb{R}$.)
According to Point 2 of Proposition 7.5, we have

$$R(A) \leq \|A\|$$

for any $A \in \mathcal{B}(\mathbf{H})$. In general, $\|A\|$ can be much bigger than $R(A)$. For
example, if $A$ is a nilpotent matrix, then $R(A) = 0$ but $\|A\|$ can be arbitrarily
large.


Lemma 8.1 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the norm and the spectral radius
of $A$ are equal:

$$\|A\| = R(A).$$
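
The contrast between the nilpotent case and the self-adjoint case is easy to see numerically for 2x2 real matrices. The sketch below is an illustration only. It assumes closed-form formulas for 2x2 matrices (eigenvalues from the characteristic polynomial, and the operator norm as the square root of the largest eigenvalue of the transpose times the matrix), and the function names are hypothetical.

```python
import math


def spectral_radius(A):
    """Largest modulus of an eigenvalue of a real 2x2 matrix."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0:
        root = math.sqrt(disc)
        return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))
    # Complex-conjugate pair of eigenvalues, each of modulus sqrt(det).
    return math.sqrt(det)


def operator_norm(A):
    """Operator norm of a real 2x2 matrix: sqrt of the top eigenvalue of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    largest = (p + r) / 2.0 + math.sqrt(((p - r) / 2.0) ** 2 + q * q)
    return math.sqrt(largest)


nilpotent = [[0.0, 1000.0], [0.0, 0.0]]   # N^2 = 0, not self-adjoint
symmetric = [[2.0, 1.0], [1.0, 2.0]]      # self-adjoint, eigenvalues 1 and 3

print(spectral_radius(nilpotent), operator_norm(nilpotent))
print(spectral_radius(symmetric), operator_norm(symmetric))
```

For the nilpotent matrix the spectral radius vanishes while the norm equals the off-diagonal entry, which can be made as large as desired; for the symmetric matrix the two quantities agree, as Lemma 8.1 asserts.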


In preparation for the proof, we determine the radius of convergence of
the power series for the resolvent given in the proof of Proposition 7.5.
According to Proposition 7.2, we have

$$\|A^*A\| = \|A\|^2$$

for any $A \in \mathcal{B}(\mathbf{H})$. If $A$ is self-adjoint, we obtain

$$\|A^2\| = \|A\|^2.$$

Iterating this relation gives

$$\left\|A^{2^n}\right\| = \|A\|^{2^n} \tag{8.1}$$

for all $n$.

Consider, for a bounded self-adjoint operator $A$, the following formal
expression for the resolvent of $A$:

$$(A - \lambda I)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m}{\lambda^{m+1}}. \tag{8.2}$$

If $|\lambda| > \|A\|$, then the proof of Proposition 7.5 shows that the series (8.2)
converges in the operator norm topology and that the sum of the series is
indeed the inverse of $(A - \lambda I)$. If, on the other hand, $|\lambda| < \|A\|$, it follows
from (8.1) that the norms of the terms in (8.2) do not tend to zero, and
so the series cannot converge in the operator norm topology. We may say,
then, that the series (8.2) has radius of convergence equal to $\|A\|$.
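
The behavior of the series (8.2) on either side of the radius of convergence can be observed numerically. The following Python sketch is an illustration under stated assumptions: it uses the hypothetical self-adjoint matrix A = [[2, 1], [1, 2]], whose norm is 3, and the helper names are invented for this example.

```python
def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def series_terms(A, lam, count):
    """Yield the terms -A^m / lam^(m+1) of (8.2) for m = 0, ..., count - 1."""
    power = [[1.0, 0.0], [0.0, 1.0]]  # A^0 = I
    for m in range(count):
        coeff = -1.0 / lam ** (m + 1)
        yield [[coeff * power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, A)


def partial_sum(A, lam, count):
    """Sum of the first `count` terms of the series (8.2)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for term in series_terms(A, lam, count):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


A = [[2.0, 1.0], [1.0, 2.0]]  # self-adjoint, with ||A|| = R(A) = 3

# |lam| > ||A||: the partial sums approach the inverse of A - lam I.
lam = 4.0
S = partial_sum(A, lam, 200)
shifted = [[A[0][0] - lam, A[0][1]], [A[1][0], A[1][1] - lam]]
product = matmul(shifted, S)
error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
            for i in range(2) for j in range(2))
assert error < 1e-12

# |lam| < ||A||: the terms grow instead of tending to zero.
sizes = [abs(term[0][0]) for term in series_terms(A, 2.0, 30)]
assert sizes[-1] > sizes[10] > sizes[0]
```

Note that the value 2.0 used in the second check lies between the two eigenvalues and is in the resolvent set of this matrix, so the resolvent exists there even though the series (8.2) does not converge to it.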

Proof of Lemma 8.1. We know that $R(A) \leq \|A\|$. To show that $R(A) =
\|A\|$, we wish to argue that $(A - \lambda I)^{-1}$ is a holomorphic operator-valued
function of $\lambda$ on the set $|\lambda| > R(A)$, and therefore the Laurent series
of $(A - \lambda I)^{-1}$ must converge for $|\lambda| > R(A)$. But the Laurent series of
$(A - \lambda I)^{-1}$ is just the series in (8.2), and we have shown that the series
diverges when $|\lambda| < \|A\|$. This would be a contradiction if $R(A)$ were less
than $\|A\|$.

To flesh out the argument, recall the formula (7.8) in the proof of
Proposition 7.5 for the resolvent of $A$.

That formula expresses the map $\lambda \mapsto (A - \lambda I)^{-1}$ as a convergent power
series in powers of $\lambda - \lambda_0$, near any point $\lambda_0$ in the resolvent set of $A$. It
follows that for any bounded linear functional $\ell \in \mathcal{B}(\mathbf{H})^*$, the
complex-valued function

$$\lambda \mapsto \ell\left((A - \lambda I)^{-1}\right)$$

is holomorphic on the resolvent set of $A$. This function has a unique Laurent
series, which is given by applying $\ell$ term by term to (8.2). The series will
converge on the largest annulus contained in the resolvent set of $A$, namely
the set of $\lambda$ with $|\lambda| > R(A)$.

Convergence of (8.2) means that $|\ell(A^m/\lambda^{m+1})|$ is bounded as a function
of $m$, for each $\ell$ and each $\lambda$ with $|\lambda| > R(A)$. Thus, by (a corollary of) the
uniform boundedness principle (Appendix A.3.4), the set $\{A^m/\lambda^{m+1}\}_{m=0}^{\infty}$
is bounded in the Banach space $\mathcal{B}(\mathbf{H})$, for all $\lambda$ with $|\lambda| > R(A)$. In
particular, for each $\lambda$ with $|\lambda| > R(A)$, there is a constant $C$ such that

$$\frac{\left\|A^{2^n}\right\|}{|\lambda|^{2^n}} = \left(\frac{\|A\|}{|\lambda|}\right)^{2^n} \leq C.$$

If $\|A\|$ were greater than $R(A)$, this inequality would be false for $\lambda$ satisfying
$R(A) < |\lambda| < \|A\|$. ■

The next key step in Stage 1 of the proof is to understand how the
spectrum of a self-adjoint operator transforms under application of a
polynomial.


Lemma 8.2 (Spectral Mapping Theorem) For all $A \in \mathcal{B}(\mathbf{H})$ and all
polynomials $p$, we have

$$\sigma(p(A)) = p(\sigma(A)).$$

That is to say, the spectrum of $p(A)$ consists precisely of the numbers of
the form $p(\lambda)$, with $\lambda$ in the spectrum of $A$.

Proof. The result is trivial if $p$ is constant. When $\deg p \geq 1$, let $p$, given by

$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0,$$

be an arbitrary polynomial. We first show that $p(\sigma(A)) \subset \sigma(p(A))$.
Suppose, then, that $\lambda \in \sigma(A)$. Observe that

$$p(A) - p(\lambda)I = a_n(A^n - \lambda^n I) + a_{n-1}(A^{n-1} - \lambda^{n-1} I) + \cdots + a_0 I - a_0 I.$$

Now,

$$A^k - \lambda^k I = (A - \lambda I)(A^{k-1} + \lambda A^{k-2} + \lambda^2 A^{k-3} + \cdots + \lambda^{k-1} I).$$

Thus, we can pull out a factor of $(A - \lambda I)$ from each nonzero term in
$p(A) - p(\lambda)I$, giving

$$p(A) - p(\lambda)I = (A - \lambda I)q(A),$$

where $q$ is a polynomial (depending on $\lambda$). Since, by assumption, $A - \lambda I$ is
not invertible, and since $(A - \lambda I)$ commutes with $q(A)$, $(A - \lambda I)q(A)$ cannot
be invertible (Exercise 1). This shows that $p(\lambda)$ belongs to the spectrum of
$p(A)$.

We now show that $\sigma(p(A)) \subset p(\sigma(A))$. Suppose, then, that $\gamma \in \sigma(p(A))$.
Since $\mathbb{C}$ is algebraically closed, we can factor the polynomial $p(z) - \gamma$, as a
function of $z$, as

$$p(z) - \gamma = c(z - b_1)(z - b_2) \cdots (z - b_n). \tag{8.3}$$

Thus,

$$p(A) - \gamma I = c(A - b_1 I)(A - b_2 I) \cdots (A - b_n I).$$

Since $p(A) - \gamma I$ is assumed to be noninvertible, there must be some $j$ such
that $(A - b_j I)$ is noninvertible, that is, for which $b_j \in \sigma(A)$. Then (8.3)
tells us that $p(b_j) - \gamma = 0$, meaning that $\gamma = p(b_j)$. Thus, $\gamma$ is of the form
$p(\lambda)$ for some $\lambda$ ($= b_j$) in $\sigma(A)$. ■
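
The spectral mapping theorem can be checked directly on a small example. The Python sketch below is an illustration only: it assumes the hypothetical symmetric matrix A = [[2, 1], [1, 2]] (spectrum {1, 3}) and the polynomial p(z) = z^2 - z, and all function names are invented for this example.

```python
import math


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def poly_of_matrix(coeffs, A):
    """Evaluate p(A) by Horner's rule; coeffs run from highest degree down."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for c in coeffs:
        result = matmul(result, A)
        result[0][0] += c  # add c * I
        result[1][1] += c
    return result


def poly_of_number(coeffs, z):
    """Evaluate p(z) by Horner's rule with the same coefficient convention."""
    value = 0.0
    for c in coeffs:
        value = value * z + c
    return value


def eigenvalues_symmetric(M):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return [mean - radius, mean + radius]


A = [[2.0, 1.0], [1.0, 2.0]]   # sigma(A) = {1, 3}
p = [1.0, -1.0, 0.0]           # p(z) = z^2 - z

spectrum_of_p_of_A = eigenvalues_symmetric(poly_of_matrix(p, A))
p_of_spectrum = sorted(poly_of_number(p, lam) for lam in eigenvalues_symmetric(A))
assert spectrum_of_p_of_A == p_of_spectrum
```

Because p has real coefficients and A is symmetric, p(A) is again symmetric, which is what allows the same eigenvalue formula to be reused for p(A).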

The last step in Stage 1 of our proof is to apply the Stone-Weierstrass
theorem to show that polynomials are dense in $C(\sigma(A); \mathbb{R})$ (the space of
continuous, real-valued functions on $\sigma(A)$) with respect to the supremum
norm.

Proposition 8.3 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then there exists a
unique bounded linear map from $C(\sigma(A); \mathbb{R})$ into $\mathcal{B}(\mathbf{H})$, denoted by $f \mapsto
f(A)$, such that when $f(\lambda) = \lambda^m$, we have $f(A) = A^m$. The map $f \mapsto f(A)$,
$f \in C(\sigma(A); \mathbb{R})$, is called the (real-valued) functional calculus for $A$.

Proof. Note that if $A$ is self-adjoint, then $p(A)$ is self-adjoint provided
that $p$ is a real-valued polynomial (i.e., one where all the coefficients are
real numbers). Thus, combining the spectral mapping theorem with the
equality of the norm and spectral radius, we have the following: If $A$ is a
self-adjoint operator and $p$ is a real-valued polynomial, then

$$\|p(A)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda)|. \tag{8.4}$$

Thus, the map $p \mapsto p(A)$ is an isometric linear map from the space of
polynomials on $\sigma(A)$ (with the supremum norm) into the space of bounded
operators on $\mathbf{H}$.

According to the Stone-Weierstrass theorem, polynomials are dense in
$C(\sigma(A); \mathbb{R})$. Thus, by the BLT theorem (Theorem A.36), we can extend the
map $p \mapsto p(A)$ uniquely to a bounded linear map of $C(\sigma(A); \mathbb{R})$ into $\mathcal{B}(\mathbf{H})$.


Proposition 8.4 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the (real-valued) continuous
functional calculus for $A$, mapping $C(\sigma(A); \mathbb{R})$ into $\mathcal{B}(\mathbf{H})$, has the following
properties.

1. Multiplicativity: For all $f, g$, we have

$$(fg)(A) = f(A)g(A),$$

where $fg$ denotes the pointwise product of $f$ and $g$.

2. Self-adjointness: For all $f$, the operator $f(A)$ is self-adjoint.

3. Non-negativity: For all $f$, if $f$ is non-negative, then $f(A)$ is a
non-negative operator.

4. Norm and spectrum properties: For all $f$, we have

$$\|f(A)\| = \sup_{\lambda \in \sigma(A)} |f(\lambda)| \tag{8.5}$$

and

$$\sigma(f(A)) = \{ f(\lambda) \mid \lambda \in \sigma(A) \}. \tag{8.6}$$

Proof. Point 1 holds for polynomials and thus, by taking limits, for all
$f \in C(\sigma(A); \mathbb{R})$. Furthermore, if $p$ is a real-valued polynomial and $A$ is
self-adjoint, then $p(A)$ is self-adjoint. From this, we get Point 2 by taking
limits. If $f \in C(\sigma(A); \mathbb{R})$ is non-negative, then $f = g^2$, where $g = \sqrt{f}$ is
real-valued. Thus, $g(A)$ is self-adjoint and for all $\psi \in \mathbf{H}$, Point 1 tells us
that

$$\langle \psi, f(A)\psi \rangle = \langle \psi, g(A)^2 \psi \rangle = \langle g(A)\psi, g(A)\psi \rangle \geq 0, \tag{8.7}$$

which establishes Point 3. We have already established (8.5) in (8.4) for
polynomials; the result for general $f \in C(\sigma(A); \mathbb{R})$ follows by taking limits.

To establish (8.6), suppose first that $\lambda_0 \in \mathbb{C}$ is not in the range of $f$.
Then the function $g(\lambda) := 1/(f(\lambda) - \lambda_0)$ is continuous on $\sigma(A)$ and the
operator $g(A)$ will be the inverse of $f(A) - \lambda_0 I$, showing that $\lambda_0$ is not in
the spectrum of $f(A)$.

In the other direction, suppose that $\lambda_0 = f(\mu)$ for some $\mu \in \sigma(A)$; we
want to show that $f(\mu) \in \sigma(f(A))$. Suppose now that $f(A) - f(\mu)I$ were
invertible and choose a sequence $p_n$ of polynomials converging uniformly
to $f$ on $\sigma(A)$. By Exercise 8 in Chap. 7, any operator sufficiently close to
$f(A) - f(\mu)I$ in the operator norm topology would also be invertible. In
particular, $p_n(A) - p_n(\mu)I$ would have to be invertible for all sufficiently
large $n$, contradicting the spectral mapping theorem. ■
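
The construction of the continuous functional calculus as a limit of polynomials can be watched in action for the exponential function. The following Python sketch is an illustration under stated assumptions, not the construction used in the text: it takes the Taylor polynomials of the exponential as one convenient choice of approximating polynomials, uses the hypothetical matrix A = [[2, 1], [1, 2]] with eigenprojections P1 and P3, and compares against the value obtained by applying the function to each eigenvalue.

```python
import math


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def taylor_exp(A, degree):
    """p_n(A) for the Taylor polynomial p_n(z) = sum over k <= n of z^k / k!."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]  # A^0 / 0!
    for k in range(degree + 1):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        term = [[x / (k + 1) for x in row] for row in matmul(term, A)]
    return total


A = [[2.0, 1.0], [1.0, 2.0]]          # eigenvalues 1 and 3
P1 = [[0.5, -0.5], [-0.5, 0.5]]       # eigenprojection for eigenvalue 1
P3 = [[0.5, 0.5], [0.5, 0.5]]         # eigenprojection for eigenvalue 3

# f(A) for f = exp, computed eigenvalue by eigenvalue.
f_of_A = [[math.exp(1.0) * P1[i][j] + math.exp(3.0) * P3[i][j]
           for j in range(2)] for i in range(2)]

for degree in (5, 10, 40):
    approx = taylor_exp(A, degree)
    gap = max(abs(approx[i][j] - f_of_A[i][j])
              for i in range(2) for j in range(2))
    print(degree, gap)
```

The printed gaps shrink as the degree grows, in line with (8.4): the operator-norm error is controlled by the uniform error of the polynomial on the spectrum.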


8.1.2  Stage  2:  An  Operator-Valued  Riesz  Representation 
Theorem 

We turn now to Stage 2 of the proof of the spectral theorem. We will make
use of the Riesz representation theorem from measure theory (not the result
about continuous linear functionals on a Hilbert space). The following form
of this result is sufficient for our purposes.

Theorem 8.5 (Riesz Representation Theorem) Let $X$ be a compact
metric space and let $C(X; \mathbb{R})$ denote the space of continuous, real-valued
functions on $X$. Suppose $\Lambda : C(X; \mathbb{R}) \to \mathbb{R}$ is a linear functional with the
property that $\Lambda(f)$ is non-negative whenever all the values of $f$ are
non-negative. Then there exists a unique (real-valued, positive) measure $\mu$ on
the Borel $\sigma$-algebra in $X$ for which

$$\Lambda(f) = \int_X f \, d\mu$$

for all $f \in C(X; \mathbb{R})$.

See pp. 353-354 of Volume I of [34] for a short proof in the case in which
$X$ is a compact subset of $\mathbb{R}$, which is all we really require. For the full result
stated above, see Theorems 7.2 and 7.8 in [12]. Observe that $\mu$ is a finite
measure, with $\mu(X) = \Lambda(\mathbf{1})$, where $\mathbf{1}$ is the constant function.

Given a bounded self-adjoint operator $A \in \mathcal{B}(\mathbf{H})$, we have constructed,
in the previous subsection, a continuous functional calculus for $A$. This
calculus is a map, denoted $f \mapsto f(A)$, from $C(\sigma(A); \mathbb{R})$ into $\mathcal{B}(\mathbf{H})$. If $f \in
C(\sigma(A); \mathbb{R})$ is non-negative, then (Point 3 of Proposition 8.4) $f(A)$ is a
non-negative operator. Thus, given $\psi \in \mathbf{H}$, if we define a linear functional $\Lambda_\psi$
on $C(\sigma(A); \mathbb{R})$ by the formula

$$\Lambda_\psi(f) = \langle \psi, f(A)\psi \rangle,$$

$\Lambda_\psi$ will satisfy the hypotheses of the Riesz representation theorem. Thus,
for each $\psi \in \mathbf{H}$, we obtain a unique measure $\mu_\psi$ such that

$$\langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \tag{8.8}$$

for all $f \in C(\sigma(A); \mathbb{R})$. Note that

$$\mu_\psi(\sigma(A)) = \Lambda_\psi(\mathbf{1}) = \|\psi\|^2. \tag{8.9}$$

Definition 8.6 If $f$ is a bounded measurable (complex-valued) function on
$\sigma(A)$, define a map $Q_f : \mathbf{H} \to \mathbb{C}$ by the formula

$$Q_f(\psi) = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda),$$

where $\mu_\psi$ is the measure in (8.8).
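
In finite dimensions the measures produced by the Riesz representation theorem are explicit: the measure attached to a vector assigns to each eigenvalue the squared length of the vector's component in that eigenspace. The Python sketch below illustrates this under stated assumptions. It uses the hypothetical matrix A = [[2, 1], [1, 2]] and real vectors, and the names `spectral_measure`, `apply`, and `inner` are invented for this example.

```python
# Assumed example: A = [[2, 1], [1, 2]] with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
EIGENPROJECTIONS = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],
    3.0: [[0.5, 0.5], [0.5, 0.5]],
}


def apply(M, v):
    """Multiply a 2x2 matrix by a vector of length 2."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def inner(u, v):
    """Inner product of two real vectors of length 2."""
    return u[0] * v[0] + u[1] * v[1]


def spectral_measure(psi):
    """mu_psi as a dict: lam -> mu_psi({lam}) = <psi, P_lam psi>."""
    return {lam: inner(psi, apply(P, psi)) for lam, P in EIGENPROJECTIONS.items()}


psi = [3.0, 1.0]
mu_psi = spectral_measure(psi)

# Total mass: mu_psi(sigma(A)) equals Lambda_psi(1) = <psi, psi>.
assert sum(mu_psi.values()) == inner(psi, psi)

# (8.8) with f(lam) = lam^2: <psi, A^2 psi> equals the integral of f d mu_psi.
lhs = inner(psi, apply(A, apply(A, psi)))
rhs = sum(lam ** 2 * mass for lam, mass in mu_psi.items())
assert lhs == rhs
```

Replacing the function lam^2 in the last check by an indicator function gives the quantity that Definition 8.6 uses to extend the calculus beyond continuous functions.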

If $f$ happens to be real valued and continuous, then $Q_f(\psi)$ is equal to
$\langle \psi, f(A)\psi \rangle$, in which case $Q_f$ is a bounded quadratic form. (See
Definition A.60 and Example A.62.) It turns out that $Q_f$ is a bounded quadratic
form for any bounded measurable $f$, in which case Proposition A.63 allows
us to associate with $Q_f$ a bounded operator, which we denote by $f(A)$.
Once the relevant properties of $f(A)$ are established, we will construct the
desired projection-valued measure by setting $\mu^A(E) = 1_E(A)$.

Proposition 8.7 For any bounded measurable function $f$ on $\sigma(A)$, the
map $Q_f$ in Definition 8.6 is a bounded quadratic form.

Proof. Let $\mathcal{F}$ denote the space of all bounded, Borel-measurable
functions $f$ for which $Q_f$ is a quadratic form. Then $\mathcal{F}$ is a vector space and
contains $C(\sigma(A); \mathbb{R})$. Furthermore, $\mathcal{F}$ is closed under uniformly bounded
pointwise limits, because $Q_f(\psi)$ is continuous with respect to such limits,
by dominated convergence. Standard measure-theoretic techniques
(Exercise 3) then show that $\mathcal{F}$ is the space of all bounded Borel-measurable
functions on $\sigma(A)$.

Meanwhile, it follows from (8.9) that

$$|Q_f(\psi)| \leq \sup_{\lambda \in \sigma(A)} |f(\lambda)| \, \|\psi\|^2,$$

showing that $Q_f$ is always a bounded quadratic form. ■

Definition 8.8 For a bounded measurable function $f$ on $\sigma(A)$, let $f(A)$ be
the operator associated to the quadratic form $Q_f$ by Proposition A.63. This
means that $f(A)$ is the unique operator such that

$$\langle \psi, f(A)\psi \rangle = Q_f(\psi) = \int_{\sigma(A)} f \, d\mu_\psi$$

for all $\psi \in \mathbf{H}$.

Observe that if $f$ is real valued, then $Q_f(\psi)$ is real for all $\psi \in \mathbf{H}$, which
means (Proposition A.63) that the associated operator $f(A)$ is self-adjoint.
We will shortly associate with $A$ a projection-valued measure $\mu^A$, and we
will show that $f(A)$, as given by Definition 8.8, agrees with $f(A)$ as given
by $\int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda)$. [See (8.10) and compare Definition 7.13.]

Proposition 8.9 For any two bounded measurable functions $f$ and $g$, we
have

$$(fg)(A) = f(A)g(A).$$

Proof. Let $\mathcal{F}_1$ denote the space of bounded measurable functions $f$ such
that $(fg)(A) = f(A)g(A)$ for all $g \in C(\sigma(A); \mathbb{R})$. Then $\mathcal{F}_1$ is a vector space
and contains $C(\sigma(A); \mathbb{R})$. We have already noted that dominated
convergence guarantees that the map $f \mapsto Q_f(\psi)$, $\psi \in \mathbf{H}$, is continuous
under uniformly bounded pointwise convergence. By the polarization identity
(Proposition A.59), the same is true for the map $f \mapsto L_f(\phi, \psi)$, where $L_f$ is
the sesquilinear form associated to $Q_f$. Now, by the polarization identity, $f$
will be in $\mathcal{F}_1$ provided that

$$\langle \psi, (fg)(A)\psi \rangle = \langle \psi, f(A)g(A)\psi \rangle$$

or, equivalently,

$$Q_{fg}(\psi) = L_f(\psi, g(A)\psi)$$

for all $\psi \in \mathbf{H}$ and all $g \in C(\sigma(A); \mathbb{R})$. From this, we can see that $\mathcal{F}_1$ is
closed under uniformly bounded pointwise limits. Thus, by Exercise 3, $\mathcal{F}_1$
consists of all bounded, Borel-measurable functions.

We now let $\mathcal{F}_2$ denote the space of all bounded, Borel-measurable
functions $f$ such that $(fg)(A) = f(A)g(A)$ for all bounded Borel-measurable
functions $g$. Our result for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ contains $C(\sigma(A); \mathbb{R})$. Thus,
the same argument as for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ consists of all bounded,
Borel-measurable functions. ■


Theorem 8.10 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. For any measurable set
$E \subset \sigma(A)$, define an operator $\mu^A(E)$ by

$$\mu^A(E) = 1_E(A),$$

where $1_E(A)$ is given by Definition 8.8. Then $\mu^A$ is a projection-valued
measure on $\sigma(A)$ and satisfies

$$\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A.$$


Theorem 8.10 establishes the existence of the projection-valued measure
in our first version of the spectral theorem (Theorem 7.12).

Proof. Since $1_E$ is real-valued and satisfies $1_E \cdot 1_E = 1_E$, the observation
following Definition 8.8 and Proposition 8.9 tell us that $1_E(A)$ is self-adjoint
and satisfies $1_E(A)^2 = 1_E(A)$. Thus,
$\mu^A(E)$ is an orthogonal projection (Proposition A.57), for any measurable
set $E \subset \sigma(A)$. If $E_1$ and $E_2$ are measurable sets, then $1_{E_1 \cap E_2} = 1_{E_1} \cdot 1_{E_2}$
and so

$$\mu^A(E_1 \cap E_2) = \mu^A(E_1)\mu^A(E_2).$$

If $E_1, E_2, \ldots$ are disjoint measurable sets, then $\mu^A(E_j)\mu^A(E_k) = \mu^A(\emptyset) = 0$,
for $j \neq k$, and so the ranges of the projections $\mu^A(E_j)$ and $\mu^A(E_k)$ are
orthogonal. It then follows by an elementary argument that, for all $\psi \in \mathbf{H}$,
we have

$$\sum_{j=1}^{\infty} \mu^A(E_j)\psi = P\psi,$$

where the sum converges in the norm topology of $\mathbf{H}$ and where $P$ is the
orthogonal projection onto the smallest closed subspace containing the
range of $\mu^A(E_j)$ for every $j$. On the other hand, if $E := \bigcup_{j=1}^{\infty} E_j$, then
the sequence $f_N := \sum_{j=1}^{N} 1_{E_j}$ is uniformly bounded (by 1) and converges
pointwise to $1_E$. Thus, using again dominated convergence in (8.8),

$$\lim_{N \to \infty} \left\langle \psi, \sum_{j=1}^{N} \mu^A(E_j)\psi \right\rangle = \langle \psi, 1_E(A)\psi \rangle.$$

It follows that $1_E(A)$ coincides with $P$, which establishes the desired
countable additivity for $\mu^A$.

Finally, if $f = 1_E$ for some Borel set $E$, then

$$\int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda) = f(A), \tag{8.10}$$

where $f(A)$ is given by Definition 8.8. [The integral is equal to $\mu^A(E)$, which
is, by definition, equal to $1_E(A)$.] The equality (8.10) then holds for simple
functions by linearity and for all bounded, Borel-measurable functions by
taking limits. In particular, if $f(\lambda) = \lambda$, then the integral of $f$ against $\mu^A$
agrees with $f(A)$ as defined in Definition 8.8, which agrees with $f(A)$ as
defined in the continuous functional calculus, which in turn agrees with
$f(A)$ as defined for polynomials, namely, $f(A) = A$. This means that

$$\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A,$$

as desired. ■

We have now established the existence of the projection-valued measure $\mu^A$ in Theorem 7.12. The uniqueness of $\mu^A$ is left as an exercise (Exercise 4). We close this section by proving Proposition 7.16, which states that if a bounded operator $B$ commutes with a bounded self-adjoint operator $A$, then $B$ commutes with $f(A)$ for all bounded, Borel-measurable functions $f$ on $\sigma(A)$.

Proof of Proposition 7.16. If $B$ commutes with $A$, then $B$ commutes with $p(A)$ for any polynomial $p$. Thus, by taking limits as in the construction of the continuous functional calculus, $B$ will commute with $f(A)$ for any continuous real-valued function $f$ on $\sigma(A)$. We now let $\mathcal{F}$ denote the space of all bounded, Borel-measurable functions $f$ on $\sigma(A)$ for which $f(A)$ commutes with $B$, so that $\mathcal{F} \supset C(\sigma(A); \mathbb{R})$.

To show that a bounded measurable $f$ belongs to $\mathcal{F}$, it suffices to show that for all $\phi, \psi \in \mathbf{H}$ we have $\langle \phi, f(A)B\psi \rangle = \langle \phi, Bf(A)\psi \rangle$, or, equivalently, $\langle \phi, f(A)B\psi \rangle = \langle B^*\phi, f(A)\psi \rangle$. That is, we want

$$L_f(\phi, B\psi) = L_f(B^*\phi, \psi).$$

But we have seen that for fixed vectors $\psi_1, \psi_2 \in \mathbf{H}$, the map $f \mapsto L_f(\psi_1, \psi_2)$ is continuous under uniformly bounded pointwise limits. Thus, $\mathcal{F}$ is closed under such limits, which implies (Exercise 3) that $\mathcal{F}$ contains all bounded, Borel-measurable functions. ■


8.2  Proof  of  the  Spectral  Theorem,  Second  Version 

We now turn to the proof of Theorem 7.19. As in the proof of Theorem 7.12, we will make use of the continuous functional calculus for a bounded self-adjoint operator $A$ and the Riesz representation theorem. We begin by establishing the special case in which $A$ has a cyclic vector, that is, a vector $\psi$ with the property that the vectors $A^k\psi$, $k = 0, 1, 2, \ldots$, span a dense subspace of $\mathbf{H}$. In that case, the direct integral will be simply an $L^2$ space (i.e., the Hilbert spaces $\mathbf{H}_\lambda$ are equal to $\mathbb{C}$ for all $\lambda$). Thus, in this special case, the direct integral and multiplication operator versions of the spectral theorem coincide.
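In finite dimensions the notion of a cyclic vector can be tested directly, and the following Python sketch (an illustration only, not part of the argument) does so. It assumes that A is a real symmetric matrix with rational entries, given as a list of rows, and it uses the fact that on an n-dimensional space the first n vectors psi, A psi, ..., A^(n-1) psi already span the same subspace as all the A^k psi. The names `krylov_rank` and `is_cyclic` are hypothetical helpers introduced here.

```python
from fractions import Fraction


def krylov_rank(A, psi):
    """Rank of the vectors psi, A psi, ..., A^(n-1) psi, by exact elimination."""
    n = len(A)
    rows, v = [], [Fraction(x) for x in psi]
    for _ in range(n):
        rows.append(v)
        v = [sum(a * x for a, x in zip(row, v)) for row in A]  # next power of A
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_cyclic(A, psi):
    """True if the vectors A^k psi span the whole (finite-dimensional) space."""
    return krylov_rank(A, psi) == len(A)


print(is_cyclic([[1, 0, 0], [0, 2, 0], [0, 0, 3]], [1, 1, 1]))  # True
print(is_cyclic([[1, 0, 0], [0, 1, 0], [0, 0, 2]], [1, 2, 3]))  # False
```

For a diagonal matrix with distinct eigenvalues, any vector with all components nonzero is cyclic. When an eigenvalue is repeated, no vector is cyclic, which is the finite-dimensional reason a decomposition into several cyclic subspaces is needed in general.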

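Lemma 8.11 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is a cyclic vector for $A$. Let $\mu_\psi$ be the unique measure on $\sigma(A)$, given by Theorem 8.5, for which

$$\langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda)\, d\mu_\psi(\lambda) \tag{8.11}$$

for all $f \in C(\sigma(A); \mathbb{R})$. Then there exists a unitary map

$$U : \mathbf{H} \to L^2(\sigma(A), \mu_\psi)$$

such that

$$[UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda)$$

for all $\phi \in L^2(\sigma(A), \mu_\psi)$.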

Proof. We start by defining $U$ on the complex vector space of vectors of the form $p(A)\psi$, where $p$ is a complex-valued polynomial, as follows:

$$U[p(A)\psi] = p.$$

To show that $U$ is well defined, write $p$ as $p = p_1 + i p_2$, where $p_1$ and $p_2$ are real-valued polynomials. Since $p_1(A)$ and $p_2(A)$ are self-adjoint and commuting, we obtain

$$\langle p(A)\psi, p(A)\psi \rangle = \big\langle \psi, [p_1(A)^2 + p_2(A)^2]\psi \big\rangle = \int_{\sigma(A)} [p_1(\lambda)^2 + p_2(\lambda)^2]\, d\mu_\psi(\lambda), \tag{8.12}$$

by canceling cross terms and applying (8.11). Thus, if $p(A)\psi = 0$ in $\mathbf{H}$, then $p(\lambda) = 0$ for $\mu_\psi$-almost every $\lambda$ in $\sigma(A)$, so that $p$ represents the zero element of $L^2(\sigma(A), \mu_\psi)$.

Equation (8.12) shows also that the map $U$ is isometric on its initial domain. This initial domain is dense in $\mathbf{H}$ since it contains the vectors $A^k\psi$ and $\psi$ is cyclic. Thus, the BLT theorem (Theorem A.36) tells us that $U$ extends uniquely to an isometric map of $\mathbf{H}$ into $L^2(\sigma(A), \mu_\psi)$. Since polynomials are dense in $L^2(\sigma(A), \mu_\psi)$ (by the Stone-Weierstrass theorem and Theorem A.10), $U$ actually is unitary.

Now, since $U$ takes $A^k\psi$ to the function $\lambda \mapsto \lambda^k$ in $L^2(\sigma(A), \mu_\psi)$, we have that $UAU^{-1}(\lambda^k) = \lambda^{k+1}$. Thus,

$$[UAU^{-1}p](\lambda) = \lambda p(\lambda)$$

for all polynomials $p$. Since polynomials are dense in $L^2(\sigma(A), \mu_\psi)$, we have $[UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda)$ for all $\phi \in L^2(\sigma(A), \mu_\psi)$, as claimed. ■
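As a finite-dimensional sanity check of (8.11), consider the following sketch. It is an illustration under stated assumptions, not part of the proof. We take the symmetric matrix A with rows (2, 1) and (1, 2), whose eigenvalues are 1 and 3, and the cyclic vector psi = (1, 0). Expanding psi in the normalized eigenvectors shows that the associated measure places mass 1/2 at each eigenvalue. The code compares the moments computed from the operator with the integrals of lambda^k against that measure. The function names are hypothetical.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def moment_from_operator(A, k):
    """<psi, A^k psi> for psi = (1, 0, ..., 0), i.e. the (0, 0) entry of A^k."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        P = matmul(P, A)
    return P[0][0]


def moment_from_measure(atoms, k):
    """Integral of lambda^k against a finite sum of point masses (lambda, weight)."""
    return sum(weight * lam ** k for lam, weight in atoms)


A = [[2, 1], [1, 2]]
mu_psi = [(1, Fraction(1, 2)), (3, Fraction(1, 2))]

for k in range(6):
    print(k, moment_from_operator(A, k), moment_from_measure(mu_psi, k))
```

Since the two columns agree on every monomial, they agree on every polynomial, and in this example the map U of Lemma 8.11 sends p(A)psi to the pair of values (p(1), p(3)), viewed as an element of the L^2 space of a two-point measure space.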

Lemma 8.12 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ is the associated projection-valued measure on $\sigma(A)$, as in Theorem 8.10. Then there exists a non-negative real-valued measure $\mu$ on $\sigma(A)$ such that for all Borel sets $E \subset \sigma(A)$, we have $\mu^A(E) = 0$ if and only if $\mu(E) = 0$.


Proof. Let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$ and let $\mu_{e_j}$ be the associated real-valued measures, given by $\mu_{e_j}(E) = \langle e_j, \mu^A(E)e_j \rangle$. Then $\mu_{e_j}(\sigma(A)) = \langle e_j, Ie_j \rangle = 1$ for all $j$. Thus, the formula

$$\mu = \sum_j \frac{1}{j^2}\, \mu_{e_j}$$

defines a finite measure on $\sigma(A)$. Given some Borel set $E \subset \sigma(A)$, if $\mu^A(E) = 0$, then $\mu_{e_j}(E) = 0$ for all $j$ and so $\mu(E) = 0$. Conversely, if $\mu(E) = 0$, then

$$0 = \langle e_j, \mu^A(E)e_j \rangle = \langle \mu^A(E)e_j, \mu^A(E)e_j \rangle$$

for all $j$, since $\mu^A(E)$ is self-adjoint and $\mu^A(E)^2 = \mu^A(E)$. Thus, $\mu^A(E)e_j = 0$ for all $j$, which means that $\mu^A(E) = 0$. ■

Lemma 8.13 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then $\mathbf{H}$ can be decomposed as an orthogonal direct sum of closed nonzero subspaces $W_j$, where each $W_j$ is invariant under $A$ and where the restriction of $A$ to $W_j$ has a cyclic vector $\psi_j$. The number of $W_j$'s is either finite or countably infinite.

Proof. Recall our standing assumption that $\mathbf{H}$ is separable, and let $\{\phi_j\}$ be a countable dense subset of $\mathbf{H}$. Let $W_1$ be the closed subspace of $\mathbf{H}$ spanned by $\phi_1, A\phi_1, A^2\phi_1, \ldots$. Then $W_1$ is invariant under $A$ and $\psi_1 := \phi_1$ is a cyclic vector for $A|_{W_1}$. If $W_1 = \mathbf{H}$ then we are done. If not, let $j$ be the smallest number such that $\phi_j$ is not contained in $W_1$. Let $\psi_2$ be the orthogonal projection of $\phi_j$ onto the orthogonal complement of $W_1$, and let $W_2$ be the closed span of $\psi_2, A\psi_2, A^2\psi_2, \ldots$. Then $W_2$ is invariant under $A$ and $\psi_2$ is a cyclic vector for $A|_{W_2}$. Furthermore, since $A$ is self-adjoint and leaves $W_1$ invariant, it also leaves $W_1^\perp$ invariant, which means that $A^k\psi_2$ is orthogonal to $W_1$ for all $k$, so that $W_2$ is orthogonal to $W_1$.

If, now, $W_1 \oplus W_2 = \mathbf{H}$, we are done. If not, we let $k$ be the smallest number such that $\phi_k$ is not in $W_1 \oplus W_2$ and we let $\psi_3$ be the projection of $\phi_k$ onto the orthogonal complement of $W_1 \oplus W_2$, and so on. Continuing on in this way, we obtain an orthogonal collection of closed subspaces that are invariant under $A$, each of which has a cyclic vector. Either the process terminates with finitely many of these subspaces spanning $\mathbf{H}$, or we get an infinite family. In the latter case, each $\phi_j$ belongs to the span of the $W_k$'s and hence the (Hilbert space) direct sum of the $W_j$'s is all of $\mathbf{H}$. ■
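The construction in this proof can be carried out mechanically in finite dimensions. The following Python sketch is a toy version under stated assumptions: A is a real symmetric matrix with rational entries, the finite list `candidates` stands in for the countable dense subset, and exact rational arithmetic replaces limits and closures. The function names are hypothetical. Each cyclic subspace is built by repeatedly applying A to the current vector and orthogonalizing against everything found so far, until nothing new appears.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def reject(v, basis):
    """Subtract from v its projection onto the span of an orthogonal basis."""
    for b in basis:
        coeff = dot(v, b) / dot(b, b)
        v = [x - coeff * y for x, y in zip(v, b)]
    return v


def cyclic_decomposition(A, candidates):
    """Return a list of orthogonal bases, one for each cyclic subspace W_j."""
    n = len(A)
    found = []   # orthogonal basis of W_1 + ... + W_j built so far
    blocks = []
    for phi in candidates:
        # psi is the part of phi orthogonal to the subspaces already found
        psi = reject([Fraction(x) for x in phi], found)
        if not any(psi):
            continue  # phi already lies in the span found so far
        block, v = [], psi
        while any(v):
            block.append(v)
            # apply A, then orthogonalize against everything found so far
            v = reject(matvec(A, v), found + block)
        found += block
        blocks.append(block)
        if len(found) == n:
            break
    return blocks


A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
candidates = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
print([len(block) for block in cyclic_decomposition(A, candidates)])
```

For this matrix, whose eigenvalue 1 is repeated, the first cyclic subspace has dimension 2 and a second one of dimension 1 is needed, so the printed list of dimensions is [2, 1].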

We are now ready for the proof of our second form of the spectral theorem.

Proof of Theorem 7.19. Let $\{W_j, \psi_j\}$ be as in Lemma 8.13, and let $A_j$ denote the restriction of $A$ to $W_j$, which is a bounded self-adjoint operator on the Hilbert space $W_j$. For each $A_j$, we can obtain a unitary map $U_j$ as in Lemma 8.11, and we wish to piece these maps together for different values of $j$ to obtain a direct integral decomposition for all of $\mathbf{H}$. To facilitate piecing the maps together, we will modify the $U_j$'s so that they all map to $L^2$ spaces over a subset of $\sigma(A)$ with respect to the same measure $\mu$.

If we apply Lemma 8.11 to $A_j$, we get a unitary map

$$U_j : W_j \to L^2(\sigma(A_j), \mu_{\psi_j})$$

such that $U_j A U_j^{-1}$ is the operator of multiplication by $\lambda$. Here, $\mu_{\psi_j}$ is the measure on $\sigma(A_j)$ given by $\mu_{\psi_j}(E) = \langle \psi_j, \mu^{A_j}(E)\psi_j \rangle$. Now, according to Exercise 5, the spectrum of $A_j$ is contained in the spectrum of $A$. Furthermore, if $E$ is a measurable subset of $\sigma(A_j) \subset \sigma(A)$, then $1_E$ may be thought of as a measurable function either on $\sigma(A_j)$ or on $\sigma(A)$. Exercise 5 tells us that $1_E(A_j)$, as defined by the functional calculus for $A_j$, coincides with the restriction to $W_j$ of $1_E(A)$. Thus, if $1_E(A) = 0$ then $1_E(A_j) = 0$ as well. Equivalently, if $\mu^A(E) = 0$ then $\mu^{A_j}(E) = 0$, where $\mu^{A_j}$ is the projection-valued measure associated to the self-adjoint operator $A_j$.

Let us now choose a measure $\mu$ as in Lemma 8.12. Any set of measure zero for $\mu$ is a set of measure zero for $\mu^A$ and thus also for $\mu^{A_j}$ and then for $\mu_{\psi_j}$. Thus, if we extend $\mu_{\psi_j}$ to a measure on $\sigma(A)$ by making it zero on $\sigma(A) \setminus \sigma(A_j)$, we have that $\mu_{\psi_j}$ is absolutely continuous with respect to $\mu$. By the Radon-Nikodym theorem (Theorem A.6), each $\mu_{\psi_j}$ has a density $\rho_j$ with respect to $\mu$, and this density is nonzero $\mu_{\psi_j}$-almost everywhere. Now, the map

$$f \mapsto \rho_j^{1/2} f$$

is easily seen to be a unitary map of $L^2(\sigma(A_j), \mu_{\psi_j})$ to $L^2(\sigma(A_j), \mu)$. Thus, we can define a unitary map

$$\tilde{U}_j : W_j \to L^2(\sigma(A_j), \mu)$$

by setting

$$(\tilde{U}_j \psi)(\lambda) = \rho_j(\lambda)^{1/2} (U_j \psi)(\lambda).$$

Since multiplication by $(\rho_j)^{1/2}$ commutes with multiplication by $\lambda$, we have

$$(\tilde{U}_j A \tilde{U}_j^{-1} \psi)(\lambda) = \lambda\psi(\lambda).$$

Now, $L^2(\sigma(A_j), \mu)$ can be thought of as a direct integral over $\sigma(A)$ with respect to $\mu$, where we take $\mathbf{H}_\lambda^j = \mathbb{C}$ for $\lambda \in \sigma(A_j)$ and we take $\mathbf{H}_\lambda^j = \{0\}$ if $\lambda \in \sigma(A_j)^c$. We now define another direct integral over $\sigma(A)$ in which the Hilbert spaces $\mathbf{H}_\lambda$, $\lambda \in \sigma(A)$, are defined by

$$\mathbf{H}_\lambda = \bigoplus_j \mathbf{H}_\lambda^j.$$

Here the measurable structure on the direct integral is defined by setting

$$e_j(\lambda) = \begin{cases} (0, 0, \ldots, 1, 0, 0, \ldots), & \lambda \in E_j \\ (0, 0, \ldots, 0, 0, 0, \ldots), & \lambda \in E_j^c, \end{cases}$$

where the 1 is in the $j$th slot. Since each $\mathbf{H}_\lambda$ is a direct sum of the $\mathbf{H}_\lambda^j$'s, the direct integral of the $\mathbf{H}_\lambda$'s is the Hilbert space direct sum over $j$ of the direct integrals of the $\mathbf{H}_\lambda^j$'s, each of which is just $L^2(\sigma(A_j), \mu)$.

Meanwhile, $\mathbf{H}$ is the direct sum of the $W_j$'s, and we have unitary maps $\tilde{U}_j$ of $W_j$ to $L^2(\sigma(A_j), \mu)$ such that $\tilde{U}_j A \tilde{U}_j^{-1}$ is just multiplication by $\lambda$ on $L^2(E_j, \mu)$. Thus, we can assemble the $\tilde{U}_j$'s into a single unitary map $U$ of $\mathbf{H}$ to the direct integral of the $\mathbf{H}_\lambda$'s, and we will have $UAU^{-1}$ equal to multiplication by $\lambda$, as desired. ■

In the interest of brevity, we will not give a complete proof of Proposition 7.22 (uniqueness in Theorem 7.19), but only indicate the main ideas. To establish the equivalence of $\mu^{(1)}$ and $\mu^{(2)}$, we observe that both measures have the same sets of measure zero as the projection-valued measure $\mu^A$ (Proposition 7.23). Meanwhile, if we have two different direct integrals, each unitarily equivalent to $\mathbf{H}$ as in (7.20), then there will be a unitary map $V$ between the two direct integrals that commutes with the operator $s(\lambda) \mapsto \lambda s(\lambda)$. Using an argument similar to that in Exercise 7, we can show that there must be bounded maps $V_\lambda : \mathbf{H}_\lambda^{(1)} \to \mathbf{H}_\lambda^{(2)}$ such that $(Vs)(\lambda) = V_\lambda s(\lambda)$ for almost every $\lambda$. Then we argue that the only way $V$ can be unitary is if $V_\lambda$ is unitary for almost every $\lambda$. This implies that $\dim \mathbf{H}_\lambda^{(1)} = \dim \mathbf{H}_\lambda^{(2)}$ for almost every $\lambda$.

Finally, we briefly indicate the proof of the multiplication operator form of the spectral theorem.

Proof of Theorem 7.20. Let $W_j$ be as in Lemma 8.13 and let $A_j$ be the restriction of $A$ to $W_j$. By the proof of Theorem 7.19, each $A_j$ is unitarily equivalent to multiplication by $\lambda$ on the Hilbert space $L^2(\sigma(A_j), \mu_j)$, for some finite measure $\mu_j$ on $\sigma(A_j)$. Let $X$ be the disjoint union of the sets $\sigma(A_j)$, let $\mu$ be the sum of the measures $\mu_j$, and let $h$ be the function whose restriction to each $\sigma(A_j)$ is the function $\lambda \mapsto \lambda$. Then $L^2(X, \mu)$ is the orthogonal direct sum of the Hilbert spaces $L^2(\sigma(A_j), \mu_j)$, which means that $L^2(X, \mu)$ may be identified unitarily with $\mathbf{H} = \bigoplus_j W_j$ in an obvious way. Under this identification, the operator $A$ corresponds to multiplication by $h$. ■


8.3  Exercises 

1. (a) Suppose $A, B \in \mathcal{B}(\mathbf{H})$ commute and $A$ is not invertible. Show that $AB$ is not invertible.

Hint: First show that if $AB$ were invertible, then $A$ would have both a left inverse and a right inverse. Then show that the left inverse and right inverse would need to be equal.

(b) Show that the result of Part (a) is false if we omit the assumption that $A$ and $B$ commute.

2. (a) Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\sigma(A) \subset [0, \infty)$. Show that $A$ has a self-adjoint square root in $\mathcal{B}(\mathbf{H})$ and therefore that $A$ is a non-negative operator (i.e., $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathbf{H}$).

(b) Give an example of a bounded operator $A$ on a Hilbert space such that $\sigma(A) \subset [0, \infty)$ but $A$ is not non-negative.

3. Let $X$ be a compact metric space and let $C(X; \mathbb{R})$ denote the space of continuous real-valued functions on $X$. Suppose that $\mathcal{F}$ is a set of bounded, measurable, complex-valued functions on $X$ with the following properties: (1) $\mathcal{F}$ is a complex vector space, (2) $\mathcal{F}$ contains $C(X; \mathbb{R})$, and (3) $\mathcal{F}$ is closed under pointwise limits of uniformly bounded sequences. (A sequence $f_n$ is uniformly bounded if there exists a constant $C$ such that $|f_n(x)| \leq C$ for all $n$ and $x$.)

(a) Let $\mathcal{L}_0$ denote the collection of those measurable sets $E$ for which $1_E$ is a uniformly bounded limit of a sequence of continuous functions. Show that $\mathcal{L}_0$ is an algebra and contains all open sets in $X$.

(b) Let $\mathcal{L}_1$ denote the collection of all measurable sets $E$ for which $1_E$ belongs to $\mathcal{F}$. Using the monotone class lemma (Theorem A.8), show that $\mathcal{L}_1$ consists of all Borel sets in $X$.

(c) Show that $\mathcal{F}$ consists of all bounded, Borel-measurable functions on $X$.


4. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ and $\nu^A$ are two projection-valued measures on $\sigma(A)$ such that

$$\int_{\sigma(A)} \lambda\, d\mu^A(\lambda) = \int_{\sigma(A)} \lambda\, d\nu^A(\lambda) = A.$$

Show that integration with respect to $\mu^A$ agrees with integration with respect to $\nu^A$, first on polynomials, then on continuous functions, and finally on bounded measurable functions. Conclude that $\mu^A = \nu^A$.

Hint :  Use  Exercise  17. 


5. Suppose $A \in \mathcal{B}(\mathbf{H})$ is a self-adjoint operator and $V$ is a closed subspace of $\mathbf{H}$ that is invariant under $A$.

(a) Using Proposition 7.7, show that the spectrum of the restriction to $V$ of $A$ is contained in the spectrum of $A$.

(b) Suppose now that $f$ is a bounded measurable function on $\sigma(A)$, which means that $f$ is also a function on $\sigma(A|_V) \subset \sigma(A)$. Show that $V$ is invariant under $f(A)$ and that

$$f(A)|_V = f(A|_V),$$

where the operator on the right-hand side is defined by the measurable functional calculus for the bounded self-adjoint operator $A|_V$.


6. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is an eigenvector for $A$, that is, a nonzero vector with $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{R}$. Show that for any bounded measurable function $f$ on $\sigma(A)$ we have

$$f(A)\psi = f(\lambda)\psi.$$

Hint: Use Exercise 5.

7. Suppose $K \subset \mathbb{R}$ is a compact set and $\mu$ is a finite measure on $K$. Let $A$ be the bounded operator on $L^2(K, \mu)$ given by

$$(A\psi)(\lambda) = \lambda\psi(\lambda).$$

Now suppose that $B$ is a bounded operator on $L^2(K, \mu)$ that commutes with $A$.

(a) Let $\phi = B\mathbf{1}$, where $\mathbf{1}$ denotes the constant function, so that $\phi \in L^2(K, \mu)$. Show that for all continuous functions $\psi$ on $K$, we have $B\psi = \phi\psi$.

(b) Using Exercise 3, show that for all bounded, Borel-measurable functions $\psi$ on $K$, we have $B\psi = \phi\psi$.

(c) Show that $\phi$ is essentially bounded (i.e., bounded outside a set of $\mu$-measure zero). Conclude that $B\psi = \phi\psi$ for all $\psi \in L^2(K, \mu)$.

8. If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, define $U(t) \in \mathcal{B}(\mathbf{H})$ by $U(t) = \exp\{itA\}$ for each $t \in \mathbb{R}$, where the exponential is defined by the functional calculus for $A$.

(a) Show that $U(t)$ is unitary for all $t$ and that $U(s)U(t) = U(s + t)$. (A family of operators with this property is called a one-parameter unitary group.)

(b) Show that the map $t \mapsto U(t)$ is continuous in the operator norm topology.

(c) Give an example of a one-parameter unitary group on a Hilbert space that is not continuous in the operator norm topology.

See Sect. 10.2 for more on one-parameter unitary groups.


Unbounded Self-Adjoint Operators


9.1  Introduction 

Recall that most of the operators of quantum mechanics, including those representing position, momentum, and energy, are not defined on the entirety of the relevant Hilbert space, but only on a dense subspace thereof. In the case of the position operator, for example, given $\psi \in L^2(\mathbb{R})$, the function $X\psi(x) = x\psi(x)$ could easily fail to be in $L^2(\mathbb{R})$. Nevertheless, the space of $\psi$'s in $L^2(\mathbb{R})$ for which $x\psi(x)$ is again in $L^2(\mathbb{R})$ is a dense subspace of $L^2(\mathbb{R})$. A closely related property of these operators is that they are not bounded, meaning that there is no constant $C$ such that

$$\|A\psi\| \leq C\|\psi\|$$

for all $\psi$ for which $A$ is defined. Because our operators are unbounded, we cannot use the BLT (bounded linear transformation) theorem to extend them to the whole Hilbert space.
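To see the failure of boundedness concretely for the position operator, one can take psi_n to be the indicator function of the interval [n, n+1]. This choice of test functions is an assumption made here for illustration. The squared norm of psi_n is 1, while the squared norm of X psi_n is the integral of x^2 over [n, n+1], namely ((n+1)^3 - n^3)/3. The short Python sketch below evaluates this ratio exactly. The function name is hypothetical.

```python
from fractions import Fraction


def norm_ratio_squared(n):
    """||X psi_n||^2 / ||psi_n||^2 for psi_n the indicator function of [n, n+1]."""
    norm_psi_sq = Fraction(1)                           # integral of 1 over [n, n+1]
    norm_xpsi_sq = Fraction((n + 1) ** 3 - n ** 3, 3)   # integral of x^2 over [n, n+1]
    return norm_xpsi_sq / norm_psi_sq


for n in (0, 1, 10, 100):
    print(n, norm_ratio_squared(n))
```

The ratio equals n^2 + n + 1/3, which grows without bound, so no single constant C can satisfy the inequality above for every psi_n.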

In this chapter and the following one, we are going to study unbounded operators defined on dense subspaces of a Hilbert space $\mathbf{H}$. We will introduce the "correct" notion of self-adjointness for unbounded operators, namely the one for which the spectral theorem holds. As it turns out, the obvious candidate for a definition of self-adjointness, namely that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all $\phi$ and $\psi$ in the domain of $A$, is not the correct one. Rather, for any unbounded operator $A$, we will define another unbounded operator $A^*$, the adjoint of $A$, with its own naturally defined domain. Then $A$ is said to be self-adjoint if $A^*$ and $A$ are the same operators with the same domain.

In the present chapter, we give the definition of an unbounded self-adjoint operator, along with conditions for self-adjointness and several examples and counterexamples. We defer a discussion of the spectral theorem itself until Chap. 10. The statement of the spectral theorem (either in terms of projection-valued measures or in terms of direct integrals) is essentially the same as in the bounded case, with only a few modifications to deal with the domain of the operator.

Although this chapter is rather technical, a reader who is willing to accept some things on faith may wish simply to read the definitions of self-adjoint and essentially self-adjoint operators in Sect. 9.2, and then skip to the statements of Theorem 9.21 and Corollary 9.22 in Sect. 9.5. As in previous chapters, $\mathbf{H}$ will denote a separable Hilbert space over $\mathbb{C}$.


9.2  Adjoint  and  Closure  of  an  Unbounded 
Operator 

Recall that we briefly introduced unbounded operators in Sect. 3.2. According to Definition 3.1, an unbounded operator $A$ on $\mathbf{H}$ is a linear map of some dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$ (the domain of $A$) into $\mathbf{H}$. As in Sect. 3.2, "unbounded" means "not necessarily bounded," meaning that we permit the case in which $\mathrm{Dom}(A) = \mathbf{H}$ and $A$ is bounded.

Now, if $A$ is bounded, then for any $\phi$, the linear functional

$$\langle \phi, A\,\cdot\, \rangle$$

is bounded. Thus, by the Riesz theorem (Theorem A.52), there is a unique $\chi$ such that

$$\langle \phi, A\,\cdot\, \rangle = \langle \chi, \cdot\, \rangle.$$

We then define the adjoint $A^*$ of $A$ by setting $A^*\phi$ equal to $\chi$. (See Sect. A.4.)

If $A$ is unbounded, then $\langle \phi, A\,\cdot\, \rangle$ is not necessarily bounded, but may be bounded for certain vectors $\phi$. If $\langle \phi, A\,\cdot\, \rangle$ does happen to be bounded, for some $\phi \in \mathbf{H}$, then the BLT theorem (Theorem A.36) says that this linear functional has a unique bounded extension from $\mathrm{Dom}(A)$ to all of $\mathbf{H}$. The Riesz theorem then tells us that there is a unique $\chi$ such that this linear functional is "inner product with $\chi$." This line of reasoning leads to the following definition, which was already introduced briefly in Sect. 3.2.

Definition 9.1 Suppose $A$ is an operator defined on a dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$. Let $\mathrm{Dom}(A^*)$ be the space of all $\phi \in \mathbf{H}$ for which the linear functional

$$\psi \mapsto \langle \phi, A\psi \rangle, \quad \psi \in \mathrm{Dom}(A),$$

is bounded. For $\phi \in \mathrm{Dom}(A^*)$, define $A^*\phi$ to be the unique vector such that $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ for all $\psi \in \mathrm{Dom}(A)$.

Saying that $\langle \phi, A\,\cdot\, \rangle$ is bounded means, explicitly, that there exists a constant $C$ such that $|\langle \phi, A\psi \rangle| \leq C\|\psi\|$ for all $\psi \in \mathrm{Dom}(A)$. As in the bounded case, the operator $A^*$ is linear on its domain, and is called the adjoint of $A$.

Another way to think about the definition of $A^*$ is as follows. Given a vector $\phi$, if there exists a vector $\chi$ such that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$ for all $\psi \in \mathrm{Dom}(A)$, then $\phi$ belongs to $\mathrm{Dom}(A^*)$ and $A^*\phi = \chi$. By the Riesz theorem, such a $\chi$ will exist if and only if $\langle \phi, A\,\cdot\, \rangle$ is bounded, which means this way of thinking about $A^*$ is equivalent to Definition 9.1.

Given a densely defined operator $A$, the adjoint $A^*$ of $A$ could fail to be densely defined. This situation, however, is a pathology that does not usually occur for operators of interest in applications.

Definition 9.2 An unbounded operator $A$ on $\mathbf{H}$ is symmetric if

$$\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle \tag{9.1}$$

for all $\phi, \psi \in \mathrm{Dom}(A)$.

As  we  will  see  shortly,  if  A  is  symmetric,  then  A*  is  an  extension  of  A, 
in  the  sense  of  the  following  definition. 

Definition 9.3 An unbounded operator $A$ is an extension of an unbounded operator $B$ if $\mathrm{Dom}(A) \supset \mathrm{Dom}(B)$ and $A = B$ on $\mathrm{Dom}(B)$.

If $A$ is an extension of $B$, then very likely $A$ is given by the same "formula" as $B$. If $\mathbf{H} = L^2(\mathbb{R})$, for example, both operators might be given by the formula $-i\hbar\, d/dx$ on their respective domains. Nevertheless, if $\mathrm{Dom}(A) \neq \mathrm{Dom}(B)$, then $A$ is still a different operator from $B$.

Proposition 9.4 An unbounded operator $A$ is symmetric if and only if $A^*$ is an extension of $A$.

Proof. If $A$ is symmetric, then for all $\phi \in \mathrm{Dom}(A)$, (9.1) and the Cauchy-Schwarz inequality show that

$$|\langle \phi, A\psi \rangle| \leq \|A\phi\|\, \|\psi\|,$$

showing that $\phi \in \mathrm{Dom}(A^*)$. In that case, (9.1) shows that the unique vector $A^*\phi$ for which $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ is nothing but $A\phi$, which means that $A^*$ agrees with $A$ on $\mathrm{Dom}(A)$.

In the other direction, if $A^*$ is an extension of $A$, then for each $\phi \in \mathrm{Dom}(A)$, we have

$$\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle = \langle A\phi, \psi \rangle,$$

for all $\psi \in \mathrm{Dom}(A)$, which shows that $A$ is symmetric. ■

We come now to the key definition of this section, that of self-adjointness. This notion constitutes the hypothesis of the spectral theorem for unbounded operators.

Definition 9.5 An unbounded operator $A$ on $\mathbf{H}$ is self-adjoint if

$$\mathrm{Dom}(A^*) = \mathrm{Dom}(A)$$

and $A^*\phi = A\phi$ for all $\phi \in \mathrm{Dom}(A)$.

We may reformulate the definition of self-adjointness by saying that $A$ is self-adjoint if $A^*$ is equal to $A$, provided that equality of unbounded operators is understood to include equality of domains. Every self-adjoint operator is symmetric (by Proposition 9.4), but there exist many operators that are symmetric without being self-adjoint. In light of Proposition 9.4, a symmetric operator is self-adjoint if and only if $\mathrm{Dom}(A^*) = \mathrm{Dom}(A)$. In trying to show that a symmetric operator is self-adjoint, the difficulty lies in showing that $\mathrm{Dom}(A^*)$ is no bigger than $\mathrm{Dom}(A)$.

Definition 9.6 An unbounded operator $A$ on $\mathbf{H}$ is said to be closed if the graph of $A$ is a closed subset of $\mathbf{H} \times \mathbf{H}$. An unbounded operator $A$ on $\mathbf{H}$ is said to be closable if the closure in $\mathbf{H} \times \mathbf{H}$ of the graph of $A$ is the graph of a function. If $A$ is closable, then the closure $A^{cl}$ of $A$ is the operator with graph equal to the closure of the graph of $A$.

To be more explicit, an operator $A$ is closed if and only if the following condition holds: Suppose a sequence $\psi_n$ belongs to $\mathrm{Dom}(A)$ and suppose that there exist vectors $\psi$ and $\phi$ in $\mathbf{H}$ with $\psi_n \to \psi$ and $A\psi_n \to \phi$. Then $\psi$ belongs to $\mathrm{Dom}(A)$ and $A\psi = \phi$. Regarding closability, an operator $A$ is not closable if there exist two elements in the closure of the graph of $A$ of the form $(\phi, \psi)$ and $(\phi, \chi)$, with $\psi \neq \chi$. Another way of putting it is to say that an operator $A$ is closable if there exists some closed extension of it, in which case the closure of $A$ is the smallest closed extension of $A$.

The notion of the closure of a (closable) operator is useful because it sweeps away some of the arbitrariness in the choice of a domain of an operator. If we consider, for example, the operator $A = -i\hbar\, d/dx$ as an unbounded operator on $L^2(\mathbb{R})$, there are many different reasonable choices for $\mathrm{Dom}(A)$, including (1) the space of $C^\infty$ functions of compact support, (2) the Schwartz space (Definition A.15), and (3) the space of continuously differentiable functions $\psi$ for which both $\psi$ and $\psi'$ belong to $L^2(\mathbb{R})$. As it turns out, each of these three choices for $\mathrm{Dom}(A)$ leads to the same operator $A^{cl}$. Note that we are not claiming that every choice for $\mathrm{Dom}(A)$ leads to the same closure; nevertheless, it is often the case that many reasonable choices do lead to the same closure.

Definition 9.7 An unbounded operator $A$ on $\mathbf{H}$ is said to be essentially self-adjoint if $A$ is symmetric and closable and $A^{cl}$ is self-adjoint.

Actually, as we shall see in the next section, a symmetric operator is always closable. Many symmetric operators fail to be even essentially self-adjoint. We will see examples of such operators in Sects. 9.6 and 9.10. Section 9.5 gives some reasonably simple criteria for determining when a symmetric operator is essentially self-adjoint.


9.3  Elementary  Properties  of  Adjoints  and  Closed 
Operators 

In  this  section,  we  spell  out  some  of  the  most  basic  and  useful  properties 
of  adjoints  and  closures  of  unbounded  operators.  In  Sect.  9.5,  we  will  draw 
on  these  results  to  prove  some  more  substantial  results.  In  what  follows, 
if  we  say  that  two  operators  “coincide,”  it  means  that  they  have  the  same 
domain  and  that  they  are  equal  on  that  common  domain. 

Proposition 9.8 1. If $A$ is an unbounded operator on $\mathbf{H}$, then the graph of the operator $A^*$ (which may or may not be densely defined) is closed in $\mathbf{H} \times \mathbf{H}$.

2. A symmetric operator is always closable.

Proof. Suppose $\psi_n$ is a sequence in the domain of $A^*$ that converges to some $\psi \in \mathbf{H}$. Suppose also that $A^*\psi_n$ converges to some $\phi \in \mathbf{H}$. Then $\langle \psi_n, A\,\cdot\, \rangle = \langle A^*\psi_n, \cdot\, \rangle$ and for any $\chi \in \mathrm{Dom}(A)$, we have

$$\langle \psi, A\chi \rangle = \lim_{n \to \infty} \langle \psi_n, A\chi \rangle = \lim_{n \to \infty} \langle A^*\psi_n, \chi \rangle = \langle \phi, \chi \rangle.$$

This shows that $\psi$ belongs to the domain of $A^*$ and that $A^*\psi = \phi$, establishing that the graph of $A^*$ is closed.

If $A$ is symmetric, $A^*$ is an extension of $A$. Since, as we have just proved, $A^*$ is closed, $A$ has a closed extension and is therefore closable. ■

Corollary  9.9  If  A  is  a  symmetric  operator  with  Dom(A)  =  H,  then  A  is 
bounded. 

Proof. Since $A$ is symmetric, it is closable by Proposition 9.8. But since the domain of $A$ is already all of $\mathbf{H}$, the closure of $A$ must coincide with $A$ itself. (The closure of $A$ always agrees with $A$ on $\mathrm{Dom}(A)$, which in this case is all of $\mathbf{H}$.) Thus, $A$ is a closed operator defined on all of $\mathbf{H}$, and the closed graph theorem (Theorem A.39) implies that $A$ is bounded. ■

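Proposition 9.10 If $A$ is a closable operator on $\mathbf{H}$, then the adjoint of $A^{cl}$ coincides with the adjoint of $A$.

Proof. Suppose that for some $\psi \in \mathbf{H}$ there exists a $\phi$ such that $\langle \psi, A^{cl}\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A^{cl})$. Since $A^{cl}$ is an extension of $A$, it follows that $\langle \psi, A\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A)$. This shows that $\mathrm{Dom}(A^*) \supset \mathrm{Dom}((A^{cl})^*)$ and that $A^*$ agrees with $(A^{cl})^*$ on $\mathrm{Dom}((A^{cl})^*)$.

In the other direction, suppose for some $\psi \in \mathbf{H}$ there exists a $\phi$ such that $\langle \psi, A\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A)$. Suppose now $\xi \in \mathrm{Dom}(A^{cl})$ with $A^{cl}\xi = \eta$. Then there exists a sequence $\chi_n$ in $\mathrm{Dom}(A)$ with $\chi_n \to \xi$ and $A\chi_n \to \eta$, and we have

$$\langle \psi, A\chi_n \rangle = \langle \phi, \chi_n \rangle$$

for all $n$. Letting $n$ tend to infinity, we obtain $\langle \psi, \eta \rangle = \langle \phi, \xi \rangle$, or $\langle \psi, A^{cl}\xi \rangle = \langle \phi, \xi \rangle$. This shows that $\psi \in \mathrm{Dom}((A^{cl})^*)$ and $(A^{cl})^*\psi = \phi$. Thus, $\mathrm{Dom}(A^*) \subset \mathrm{Dom}((A^{cl})^*)$. ■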

Proposition 9.11 If $A$ is essentially self-adjoint, then $A^{cl}$ is the unique self-adjoint extension of $A$.

Proof. Suppose $B$ is a self-adjoint extension of $A$. Since $B = B^*$, $B$ is closed and is, therefore, an extension of $A^{cl}$. It then follows from the definition of the adjoint (together with the fact that $A^{cl}$ is self-adjoint, so that $(A^{cl})^* = A^{cl}$) that $\mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{cl})$. Thus, we have

$$\mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{cl}) \subset \mathrm{Dom}(B).$$

Since $B$ is self-adjoint, all three of the above sets must be equal, so actually $B = A^{cl}$. ■

Proposition 9.12 If $A$ is an unbounded operator on $\mathbf{H}$, then

$$(\mathrm{Range}(A))^\perp = \ker(A^*).$$

Proof. First assume that $\psi \in (\mathrm{Range}(A))^\perp$. Then for all $\phi \in \mathrm{Dom}(A)$ we have

$$\langle \psi, A\phi \rangle = 0.$$

That is to say, the linear functional $\langle \psi, A\,\cdot\, \rangle$ is bounded (in fact, zero) on $\mathrm{Dom}(A)$. Thus, from the definition of the adjoint, we conclude that $\psi \in \mathrm{Dom}(A^*)$ and $A^*\psi = 0$.

Meanwhile, suppose that $\psi$ is in $\mathrm{Dom}(A^*)$ and that $A^*\psi = 0$. The only way this can happen is if the linear functional $\langle \psi, A\,\cdot\, \rangle$ is zero on $\mathrm{Dom}(A)$, which means that $\psi$ is orthogonal to the image of $A$. ■

Proposition 9.13 Suppose $A$ is an unbounded operator on $\mathbf{H}$ and that $B$ is a bounded operator defined on all of $\mathbf{H}$. Let $A + B$ denote the operator with $\mathrm{Dom}(A + B) = \mathrm{Dom}(A)$ and given by $(A + B)\psi = A\psi + B\psi$ for all $\psi \in \mathrm{Dom}(A)$. Then $(A + B)^*$ has the same domain as $A^*$ and $(A + B)^*\psi = A^*\psi + B^*\psi$ for all $\psi \in \mathrm{Dom}(A^*)$.

In particular, the sum of an unbounded self-adjoint operator and a bounded self-adjoint operator (defined on all of $\mathbf{H}$) is self-adjoint on the domain of the unbounded operator.

Proof. See Exercise 3. ■

The sum of two unbounded self-adjoint operators is not, in general, self-adjoint. See Sect. 9.9 for more information about this issue.

Proposition 9.14 Let $A$ be a closed operator and $\lambda$ an element of $\mathbb{C}$. Suppose that there exists $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \geq \varepsilon \|\psi\| \tag{9.2} \]

for all $\psi$ in $\mathrm{Dom}(A)$. Then the range of $A - \lambda I$ is a closed subspace of $\mathbf{H}$.

Here, we take the domain of the operator $A - \lambda I$ to coincide with the domain of $A$, as in Proposition 9.13.

Proof. Assume that $\phi_n$ is a sequence in the range of $A - \lambda I$ converging to some $\phi$. Then $\phi_n = (A - \lambda I)\psi_n$, for some sequence $\psi_n$ in $\mathrm{Dom}(A)$. Applying (9.2) with $\psi = \psi_n - \psi_m$ shows that $\|\psi_n - \psi_m\| \leq (1/\varepsilon)\|\phi_n - \phi_m\|$. This means that $\psi_n$ is Cauchy and thus convergent to some vector $\psi$. Since $\psi_n \to \psi$ and $(A - \lambda I)\psi_n = \phi_n \to \phi$, we have that

\[ A\psi_n = \lambda\psi_n + \phi_n \to \lambda\psi + \phi. \]

Thus, by the definition of a closed operator, $\psi \in \mathrm{Dom}(A)$ and $A\psi = \lambda\psi + \phi$. This means that $(A - \lambda I)\psi = \phi$ and so the range of $A - \lambda I$ is closed. ■

We conclude this section with a simple example for which we can compute the adjoint and closure explicitly.


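Example 9.15 Let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$ and let $\langle \lambda_j \rangle$ be an arbitrary sequence of real numbers. Define an operator $A$ on $\mathbf{H}$ with $\mathrm{Dom}(A)$ equal to the space of finite linear combinations of the $e_j$'s, with $A$ itself defined by

\[ A e_j = \lambda_j e_j. \]

Then $A$ is symmetric and closable and $\mathrm{Dom}(A^*) = \mathrm{Dom}(A^{cl}) = V$, where

\[ V = \Big\{ \psi = \sum_j a_j e_j \;\Big|\; \sum_j (1 + \lambda_j^2)\,|a_j|^2 < \infty \Big\}. \tag{9.3} \]

For any $\psi = \sum_j a_j e_j$ in $V$ we have

\[ A^*\psi = A^{cl}\psi = \sum_j a_j \lambda_j e_j. \tag{9.4} \]

Thus, $(A^{cl})^* = A^* = A^{cl}$, showing that $A$ is essentially self-adjoint.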

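Proof. Note that for any sequence $\langle a_j \rangle$ of coefficients satisfying the condition on the right-hand side of (9.3), we have $\sum_j |a_j|^2 < \infty$ and, thus, the sum $\sum_j a_j e_j$ converges in $\mathbf{H}$. Suppose first that $\phi = \sum_j a_j e_j$ belongs to $V$. Then for any $\psi = \sum_j b_j e_j$ (finite sum) in the domain of $A$ we have

\[ \langle \phi, A\psi \rangle = \sum_j \overline{a_j}\, \lambda_j b_j \]

and so by the Cauchy-Schwarz inequality,

\[ |\langle \phi, A\psi \rangle| \leq \|\psi\| \Big( \sum_j |a_j|^2 \lambda_j^2 \Big)^{1/2}. \]

Thus, $\langle \phi, A\,\cdot \rangle$ is a bounded linear functional, showing that $\phi \in \mathrm{Dom}(A^*)$. Furthermore, it is apparent that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$ for all $\psi \in \mathrm{Dom}(A)$, where $\chi = \sum_j a_j \lambda_j e_j$.

Meanwhile, suppose $\phi = \sum_j a_j e_j$ belongs to the domain of $A^*$, and consider $\psi_N := \sum_{j=1}^N \lambda_j a_j e_j$ in $\mathrm{Dom}(A)$. Then

\[ |\langle \phi, A\psi_N \rangle| = \sum_{j=1}^N \lambda_j^2 |a_j|^2 = \|\psi_N\| \Big( \sum_{j=1}^N \lambda_j^2 |a_j|^2 \Big)^{1/2}. \]

Since $\phi \in \mathrm{Dom}(A^*)$, the functional $\langle \phi, A\,\cdot \rangle$ is bounded, and so $\sum_{j=1}^N \lambda_j^2 |a_j|^2$ must be bounded, independent of $N$, and so $\sum_j \lambda_j^2 |a_j|^2 < \infty$. Since $\phi$ belongs to $\mathbf{H}$, we have also that $\sum_j |a_j|^2 < \infty$, showing that $\phi$ is in $V$.

Turning now to the closure of $A$, it is apparent that $A$ is symmetric and thus closable, by Proposition 9.8. Suppose $\psi = \sum_j a_j e_j$ belongs to $V$ and consider $\psi_N := \sum_{j=1}^N a_j e_j$. Clearly, $\psi_N$ converges to $\psi$. Furthermore, since $\psi \in V$, we see that $A\psi_N$ converges to the vector $\sum_j a_j \lambda_j e_j$. This shows that $\psi \in \mathrm{Dom}(A^{cl})$ and that $A^{cl}\psi = \sum_j a_j \lambda_j e_j$. Thus, each element of $V$ belongs to $\mathrm{Dom}(A^{cl})$ and $A^{cl}$ is given on $V$ by (9.4).

Now, the space $V$ forms a Hilbert space with respect to the norm given by

\[ \|\psi\|_V^2 = \sum_j (1 + \lambda_j^2)\,|a_j|^2, \]

where $\psi = \sum_j a_j e_j$. [To establish completeness of $V$ with respect to this norm, note that $V$ can be identified isometrically with $L^2(\mathbb{N})$ with respect to the measure $\mu$ for which $\mu(\{j\}) = 1 + \lambda_j^2$.] Suppose, now, that we have a sequence $\langle \psi^m \rangle$ in $\mathrm{Dom}(A)$ for which both $\langle \psi^m \rangle$ and $\langle A\psi^m \rangle$ are convergent. Then $\langle \psi^m \rangle$ forms a Cauchy sequence in $V$ which converges to some element $\psi$ of $V$. Since $\|\psi\|_{\mathbf{H}} \leq \|\psi\|_V$ for all $\psi \in V \supset \mathrm{Dom}(A)$, we see that $\psi^m$ also converges in $\mathbf{H}$ to $\psi \in V$. This shows that each element of $\mathrm{Dom}(A^{cl})$ belongs to $V$. ■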
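
To make the domain condition in Example 9.15 concrete, the following small numerical sketch (not part of the original argument) looks at one particular choice of data. The choices $\lambda_j = j$ and $a_j = 1/j$, and the function name `partial_sums`, are assumptions made purely for illustration. With these choices, $\sum_j |a_j|^2$ converges, so $\psi = \sum_j a_j e_j$ is a vector in $\mathbf{H}$, while $\sum_j (1 + \lambda_j^2)|a_j|^2$ grows without bound, so $\psi$ does not belong to $V$.

```python
import math


def partial_sums(n_terms):
    """Partial sums for the illustrative data a_j = 1/j, lambda_j = j.

    Returns (sum of |a_j|^2, sum of (1 + lambda_j^2) |a_j|^2) for j = 1..n_terms.
    """
    norm_sq = 0.0   # squared norm of the truncated vector in H
    graph_sq = 0.0  # squared V-norm of the truncated vector
    for j in range(1, n_terms + 1):
        a_j = 1.0 / j
        lam_j = float(j)
        norm_sq += a_j ** 2
        graph_sq += (1.0 + lam_j ** 2) * a_j ** 2
    return norm_sq, graph_sq


for n in (10, 1000, 100000):
    h_norm_sq, v_norm_sq = partial_sums(n)
    # The first column stays below pi^2 / 6; the second grows roughly like n.
    print(n, h_norm_sq, v_norm_sq)
```

The first partial sum stays below $\pi^2/6$, whereas the second exceeds $n$ after $n$ terms, which is the numerical shadow of the statement that $\mathrm{Dom}(A^{cl}) = V$ is a proper subspace of $\mathbf{H}$ whenever the $\lambda_j$ are unbounded.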


9.4  The  Spectrum  of  an  Unbounded  Operator 

Recall that if $A$ is a bounded operator, then a number $\lambda \in \mathbb{C}$ belongs to the resolvent set of $A$ if the operator $A - \lambda I$ has a bounded inverse, and $\lambda$ belongs to the spectrum of $A$ if $A - \lambda I$ does not have a bounded inverse. For an unbounded operator $A$, we will say that a number $\lambda \in \mathbb{C}$ is in the resolvent set of $A$ if $A - \lambda I$ has a bounded inverse. That is, even though $A$ is unbounded, for $\lambda$ to be in the resolvent set of $A$, there must be a bounded inverse to $A - \lambda I$; otherwise, $\lambda$ is in the spectrum of $A$. We make this characterization more precise in the following definition.

Definition 9.16 Suppose $A$ is an unbounded operator on $\mathbf{H}$. A number $\lambda \in \mathbb{C}$ belongs to the resolvent set of $A$ if there exists a bounded operator $B$ with the following properties: (1) For all $\psi \in \mathbf{H}$, $B\psi$ belongs to $\mathrm{Dom}(A)$ and $(A - \lambda I)B\psi = \psi$, and (2) for all $\psi \in \mathrm{Dom}(A)$ we have $B(A - \lambda I)\psi = \psi$. If no such bounded operator $B$ exists, then $\lambda$ belongs to the spectrum of $A$.

Note that we are implicitly taking $\mathrm{Dom}(A - \lambda I)$ to equal $\mathrm{Dom}(A)$, as in Proposition 9.13. As in the bounded case, even if $A$ is self-adjoint, points $\lambda$ in the spectrum of $A$ are not necessarily eigenvalues; that is, there does not necessarily exist a nonzero $\psi \in \mathrm{Dom}(A)$ with $A\psi = \lambda\psi$. On the other hand, if $A\psi = \lambda\psi$ for some nonzero $\psi \in \mathrm{Dom}(A)$, then $A - \lambda I$ is not injective and thus $\lambda$ certainly does belong to the spectrum of $A$.

Theorem 9.17 If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, the spectrum of $A$ is contained in the real line.

If $A$ is symmetric but not self-adjoint, then the spectrum of $A$ must contain points not in the real line. Indeed, Theorem 9.21 will show that at least one of $(A - iI)$ and $(A + iI)$ must fail to be surjective, and thus at least one of the numbers $i$ and $-i$ is in the spectrum of $A$. Nevertheless, a symmetric operator cannot have nonreal eigenvalues, as we showed already in Proposition 3.4.

Proof. Consider a complex number $\lambda = a + ib$ with $b \neq 0$. Since $A$ is symmetric, the proof of Lemma 7.8 applies, giving

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle \geq b^2 \langle \psi, \psi \rangle \tag{9.5} \]

for all $\psi \in \mathrm{Dom}(A)$. This shows that $(A - \lambda I)$ is injective.

Meanwhile, applying Propositions 9.12 and 9.13 with $B = -\lambda I$ we see that

\[ (\mathrm{Range}(A - \lambda I))^{\perp} = \ker((A - \lambda I)^*) = \ker(A^* - \bar{\lambda} I) = \ker(A - \bar{\lambda} I). \]

Since $\bar{\lambda}$ again has nonzero imaginary part, $A - \bar{\lambda} I$ is also injective, showing that $\mathrm{Range}(A - \lambda I)$ is dense in $\mathbf{H}$. Since $A = A^*$ is closed, (9.5) allows us to apply Proposition 9.14 to show that $\mathrm{Range}(A - \lambda I)$ is closed, hence all of $\mathbf{H}$.

We have shown, then, that $(A - \lambda I)$ maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$. It follows from (9.5) (or the closed graph theorem) that the inverse operator is bounded, so that $\lambda$ is in the resolvent set of $A$. ■
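
Since Lemma 7.8 is not reproduced in this section, it may help to see why an estimate of the form (9.5) uses only symmetry. The following is a sketch of the standard computation, under the assumption (ours, for this sketch) that the inner product is conjugate-linear in the first factor and linear in the second. Write $\lambda = a + ib$ with $a, b$ real and take $\psi \in \mathrm{Dom}(A)$:

```latex
\begin{align*}
\langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle
  &= \langle (A - aI)\psi - ib\,\psi,\; (A - aI)\psi - ib\,\psi \rangle \\
  &= \|(A - aI)\psi\|^2 + b^2 \|\psi\|^2
     + ib \left[ \langle \psi, (A - aI)\psi \rangle
               - \langle (A - aI)\psi, \psi \rangle \right] \\
  &= \|(A - aI)\psi\|^2 + b^2 \|\psi\|^2 \\
  &\geq b^2 \langle \psi, \psi \rangle .
\end{align*}
```

The bracketed cross term vanishes because $A - aI$ is symmetric on $\mathrm{Dom}(A)$ when $a$ is real, which is the only property of $A$ used.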

Our next result shows that the spectrum of an unbounded self-adjoint operator has properties similar to that of a bounded self-adjoint operator.

Proposition 9.18 If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, then the following hold.

1. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of $A$ if and only if there exists a sequence $\psi_n$ of nonzero vectors in $\mathrm{Dom}(A)$ such that

\[ \lim_{n \to \infty} \frac{\|(A - \lambda I)\psi_n\|}{\|\psi_n\|} = 0. \tag{9.6} \]

2. The spectrum $\sigma(A)$ of $A$ is a closed subset of $\mathbb{R}$.

Although the spectrum of a bounded self-adjoint operator is a bounded subset of $\mathbb{R}$, the spectrum of an unbounded self-adjoint operator will be unbounded. Indeed, it can be shown (using the spectral theorem) that if a self-adjoint operator has bounded spectrum, then the operator must be bounded.

Proof. For Point 1, if a sequence as in (9.6) existed, then as in the proof of Proposition 7.7, $A - \lambda I$ could not have a bounded inverse, so $\lambda$ must be in the spectrum of $A$. Conversely, suppose no such sequence exists. Then there is some $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \geq \varepsilon \|\psi\| \tag{9.7} \]

for all $\psi \in \mathrm{Dom}(A)$. This means that $A - \lambda I$ is injective and that, by Proposition 9.14, the range of $A - \lambda I$ is closed. But

\[ (A - \lambda I)^* = A^* - \lambda I = A - \lambda I \]

and $A - \lambda I$ is injective, so by Proposition 9.12, the range of $A - \lambda I$ is all of $\mathbf{H}$. This means $A - \lambda I$ has an inverse, which is bounded by (9.7). Thus $\lambda$ is not in the spectrum of $A$.

Point 2 is left as an exercise (Exercise 4). ■
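
As an illustration of Point 1 of Proposition 9.18, here is a worked example of such a sequence for a spectral point that is not an eigenvalue. It is a sketch that assumes the position operator $X$ on $L^2(\mathbb{R})$, $(X\psi)(x) = x\psi(x)$, with its natural domain (treated in Sect. 9.8); the specific vectors $\psi_n$ are our choice for illustration. Fix $\lambda \in \mathbb{R}$ and let $\psi_n = \sqrt{n}\,\mathbf{1}_{[\lambda, \lambda + 1/n]}$.

```latex
\begin{align*}
\|\psi_n\|^2 &= n \int_{\lambda}^{\lambda + 1/n} dx = 1, \\
\|(X - \lambda I)\psi_n\|^2
  &= n \int_{\lambda}^{\lambda + 1/n} (x - \lambda)^2 \, dx
   = n \int_0^{1/n} t^2 \, dt
   = \frac{1}{3n^2} \longrightarrow 0 .
\end{align*}
```

Each $\psi_n$ has compact support and so lies in the domain of $X$, and the ratio in Point 1 tends to zero, so every real $\lambda$ is in the spectrum of $X$. On the other hand, $X\psi = \lambda\psi$ forces $(x - \lambda)\psi(x) = 0$ almost everywhere, hence $\psi = 0$ in $L^2(\mathbb{R})$, so no $\lambda$ is an eigenvalue.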

Definition 9.19 Let $A$ be an unbounded operator on $\mathbf{H}$. Then $A$ is non-negative if $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathrm{Dom}(A)$ and $A$ is bounded below by $c \in \mathbb{R}$ if $\langle \psi, A\psi \rangle \geq c \|\psi\|^2$ for all $\psi \in \mathrm{Dom}(A)$.

Proposition 9.20 Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$. If $A$ is non-negative, then the spectrum of $A$ is contained in $[0, \infty)$. More generally, if $A$ is bounded below by $c$, then the spectrum of $A$ is contained in $[c, \infty)$.


We will eventually see, using the spectral theorem for unbounded self-adjoint operators, that the converse to Proposition 9.20 also holds: If the spectrum of a self-adjoint operator $A$ is contained in $[0, \infty)$, then $A$ is non-negative, and if the spectrum of $A$ is contained in $[c, \infty)$, then $A$ is bounded below by $c$. These results follow easily, for example, from the form of the spectral theorem in Theorem 10.9.

Proof. Suppose $A$ is bounded below by $c$ and $\lambda$ is a point in the spectrum of $A$. If $\psi_n$ is a sequence as in Point 1 of Proposition 9.18, with the $\psi_n$'s normalized to be unit vectors, then

\[ \lim_{n \to \infty} |\langle \psi_n, (A - \lambda I)\psi_n \rangle| \leq \lim_{n \to \infty} \|(A - \lambda I)\psi_n\| = 0. \]

On the other hand, $A = \lambda I + (A - \lambda I)$, and so

\[ \langle \psi_n, A\psi_n \rangle = \lambda + \langle \psi_n, (A - \lambda I)\psi_n \rangle. \]

Thus, $\langle \psi_n, A\psi_n \rangle$ converges to $\lambda$ ($= \lambda \langle \psi_n, \psi_n \rangle$) as $n$ tends to infinity. Since $A$ is bounded below by $c$, we must have $\lambda \geq c$. This establishes the result for operators bounded below by $c$. Specializing to $c = 0$ gives the result for non-negative operators. ■


9.5  Conditions  for  Self-Adjointness  and  Essential 
Self-Adjointness 

In this section, we give criteria for determining whether a symmetric operator is self-adjoint or essentially self-adjoint. See also Sect. 10.2 for the connection between self-adjoint operators and one-parameter unitary groups.

Theorem 9.21 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is essentially self-adjoint if and only if $\mathrm{Range}(A - iI)$ and $\mathrm{Range}(A + iI)$ are dense subspaces of $\mathbf{H}$.

Using Proposition 9.12, we can reformulate this result as follows.

Corollary 9.22 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is essentially self-adjoint if and only if the operators $A^* + iI$ and $A^* - iI$ are injective on $\mathrm{Dom}(A^*)$.

As Exercise 11 shows, it is possible to have one of the operators $A^* + iI$ and $A^* - iI$ be injective and the other fail to be injective.

Proof of Theorem 9.21. Assume first that $A$ is essentially self-adjoint, so that $A^{cl}$ is self-adjoint. Then $A^* = (A^{cl})^* = A^{cl}$, and so

\[ [\mathrm{Range}(A - iI)]^{\perp} = \ker(A^* + iI) = \ker(A^{cl} + iI) = \{0\}, \]

by Theorem 9.17, and similarly for the range of $A + iI$.

Conversely, assume $A$ is symmetric and that $A - iI$ and $A + iI$ both have dense range. Since $(A^{cl})^* = A^*$ is a closed extension of $A$, it is also an extension of $A^{cl}$, showing that $A^{cl}$ is symmetric. We may then apply Lemma 7.8 (the proof of which requires only symmetry) to the operator $A^{cl}$ with $\lambda = i$, giving

\[ \|(A^{cl} - iI)\psi\|^2 \geq \|\psi\|^2 \tag{9.8} \]

and showing that $A^{cl} - iI$ is injective. Since the range of $A - iI$ is dense, the range of $A^{cl} - iI$ is certainly also dense. But since $A^{cl}$ is closed, (9.8) and Proposition 9.14 tell us that the range of $A^{cl} - iI$ is closed, hence all of $\mathbf{H}$. Similar reasoning shows that the range of $A^{cl} + iI$ is also all of $\mathbf{H}$.

Now, by Proposition 9.13, $(A^{cl} - iI)^* = (A^{cl})^* + iI$, which is an extension of $A^{cl} + iI$. Suppose $(A^{cl})^* + iI$ is a proper extension of $A^{cl} + iI$, that is, that the domain of $(A^{cl})^* + iI$ is strictly bigger than the domain of $A^{cl} + iI$. Then since $A^{cl} + iI$ already maps onto $\mathbf{H}$, $(A^{cl})^* + iI$ cannot be injective. Thus, the operator

\[ (A^{cl})^* + iI = A^* + iI = (A - iI)^* \]

must have a nontrivial kernel. Then by Proposition 9.12, $\mathrm{Range}(A - iI)$ is not dense, contradicting our assumptions.

We conclude, therefore, that $(A^{cl})^* + iI$ is not a proper extension of $A^{cl} + iI$, i.e., that $(A^{cl})^* + iI = A^{cl} + iI$ (with equality of domains). This, by Proposition 9.13, means that $(A^{cl})^* = A^{cl}$ (with equality of domains), which is what we are trying to prove. ■


Proposition 9.23 If $A$ is a symmetric operator on $\mathbf{H}$, then $A$ is self-adjoint if and only if

\[ \mathrm{Range}(A - iI) = \mathrm{Range}(A + iI) = \mathbf{H}. \]

Proof. Suppose first that $A$ is self-adjoint. Then by Theorem 9.21, the ranges of $A - iI$ and $A + iI$ are dense in $\mathbf{H}$. On the other hand,

\[ \|(A - iI)\psi\|^2 \geq \|\psi\|^2 \tag{9.9} \]

by (the proof of) Lemma 7.8, with $\lambda = i$. Since, also, $A = A^*$ is closed, Proposition 9.14 tells us that the range of $A - iI$ is closed, hence all of $\mathbf{H}$. A similar argument shows that the range of $A + iI$ is all of $\mathbf{H}$.

Conversely, suppose that the ranges of $A - iI$ and $A + iI$ are all of $\mathbf{H}$. Then $A$ is essentially self-adjoint by Theorem 9.21, so that $A^*$ is self-adjoint. Since $A - iI$ already maps onto $\mathbf{H}$, if $A^*$ were a nontrivial extension of $A$, then $A^* - iI$ could not be injective. But (9.9), with $A$ replaced by $A^*$, shows that $A^* - iI$ is injective. Thus, $A = A^*$ and so $A$ is self-adjoint. ■

In the case that $A$ is positive-semidefinite (i.e., $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathrm{Dom}(A)$), there is another self-adjointness condition, the proof of which is very similar to that of Theorem 9.21.

Theorem 9.24 Suppose that $A$ is a symmetric operator on $\mathbf{H}$ and that $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathrm{Dom}(A)$. Then $A$ is essentially self-adjoint if and only if $A + I$ has dense range. Equivalently, $A$ is essentially self-adjoint if and only if $A^* + I$ is injective.

Proof. Assume first that $A$ is essentially self-adjoint. Then $(A + I)^* = A^* + I = A^{cl} + I$. It is easily seen that $A^{cl}$ is also positive-semidefinite, and so

\[ \langle \psi, (A^{cl} + I)\psi \rangle = \langle \psi, \psi \rangle + \langle \psi, A^{cl}\psi \rangle \geq \langle \psi, \psi \rangle \tag{9.10} \]

for all $\psi \in \mathrm{Dom}(A^{cl})$. Thus, $A^{cl} + I = (A + I)^*$ is injective. Thus, the range of $A + I$ is dense, by Proposition 9.12.

Now assume that $A + I$ has dense range. The operator $A^{cl}$ is symmetric and positive-semidefinite, so (9.10) still holds, and together with the Cauchy-Schwarz inequality it gives $\|(A^{cl} + I)\psi\| \geq \|\psi\|$. Thus $A^{cl} + I$ is injective and, by Proposition 9.14, the range of $A^{cl} + I$ is closed, hence all of $\mathbf{H}$. Assume $\mathrm{Dom}(A^*)$ is strictly larger than $\mathrm{Dom}(A^{cl})$. Then because $A^{cl} + I$ is already surjective, $A^* + I$ (which has a domain equal to the domain of $A^*$) cannot be injective. Thus, $A^* + I = (A + I)^*$ has a nontrivial kernel, which means that the range of $A + I$ is not dense. This is a contradiction, and so the domain of $A^*$ must actually be equal to the domain of $A^{cl}$. Since $A$ and so also $A^{cl}$ are symmetric, this means that $A^{cl}$ is self-adjoint. ■

Example 9.25 Suppose that $A$ is a symmetric operator on $\mathbf{H}$ that has an orthonormal basis of eigenvectors. That is to say, suppose there is an orthonormal basis $\{e_j\}$ for $\mathbf{H}$ such that for each $j$, we have $e_j \in \mathrm{Dom}(A)$ and $Ae_j = \lambda_j e_j$ for some real number $\lambda_j$. Then $A$ is essentially self-adjoint.

This result is a strengthening of Example 9.15, in that we do not assume that the domain of $A$ is equal to the space of finite linear combinations of the $e_j$'s.

Proof. For any $j$, $(A - iI)e_j = (\lambda_j - i)e_j$. Since $\lambda_j$ is real, we have a nonzero multiple of $e_j$ belonging to $\mathrm{Range}(A - iI)$, for each $j$. This shows that $\mathrm{Range}(A - iI)$ is dense, and similarly for $\mathrm{Range}(A + iI)$. ■

Example 9.26 Suppose $\mathbf{H}$ is a Hilbert space direct sum of a sequence of separable Hilbert spaces $\mathbf{H}_j$:

\[ \mathbf{H} = \bigoplus_{j=1}^{\infty} \mathbf{H}_j. \]

Suppose also that $A_j$ is a bounded self-adjoint operator on $\mathbf{H}_j$, for each $j$. Define a subspace $V$ of $\mathbf{H}$ by

\[ V = \Big\{ \psi = (\psi_1, \psi_2, \ldots) \;\Big|\; \sum_{j=1}^{\infty} \big( \|\psi_j\|^2 + \|A_j \psi_j\|^2 \big) < \infty \Big\}. \]

Suppose now that $A$ is a symmetric operator on $\mathbf{H}$ whose domain contains the finite direct sum of the $\mathbf{H}_j$'s and such that $A|_{\mathbf{H}_j} = A_j$. Then $A$ is essentially self-adjoint, $\mathrm{Dom}(A^{cl}) = \mathrm{Dom}(A^*) = V$, and

\[ A^{cl}\psi = A^*\psi = (A_1\psi_1, A_2\psi_2, \ldots) \tag{9.11} \]

for all $\psi = (\psi_1, \psi_2, \ldots)$ in $V$.

See Definition A.45 for the definition of the Hilbert direct sum and the finite direct sum of a sequence of Hilbert spaces. Example 9.25 is the special case of Example 9.26 in which each $\mathbf{H}_j$ has dimension 1. This result will be useful to us in Chap. 10.

Proof. Since $A_j$ is self-adjoint, the ranges of $A_j - iI$ and $A_j + iI$ are dense in $\mathbf{H}_j$. Thus, the closure of the range of $A - iI$ contains each $\mathbf{H}_j$ and is therefore dense in $\mathbf{H}$, and similarly for $A + iI$. This shows that $A$ is essentially self-adjoint.

It remains to show that the domain of $A^* = A^{cl}$ is $V$. Let $W$ denote the finite direct sum of the $\mathbf{H}_j$'s. By the argument in the previous paragraph, $A|_W$ is essentially self-adjoint. Then $A^*$ is a symmetric extension of $(A|_W)^*$, which is self-adjoint, and so $A^*$ must coincide with $(A|_W)^*$. Thus, it suffices to consider the case $\mathrm{Dom}(A) = W$.

If we assume that $\mathrm{Dom}(A) = W$, we can compute the adjoint of $A$ by the argument in Example 9.15. If $\phi \in V$, then the Cauchy-Schwarz inequality shows that the linear functional $\langle \phi, A\,\cdot \rangle$ is bounded and that $A^*\phi$ is as in (9.11). On the other hand, if $\langle \phi, A\,\cdot \rangle$ is bounded, where $\phi = (\phi_1, \phi_2, \ldots)$, take

\[ \psi_N = (A_1\phi_1, A_2\phi_2, \ldots, A_N\phi_N, 0, 0, \ldots). \]

Then, as in the proof of Example 9.15, the only way we can have $|\langle \phi, A\psi_N \rangle| \leq C \|\psi_N\|$ is if $\phi$ belongs to $V$. ■


9.6  A  Counterexample 

In this section, we will examine an elementary example of an operator that is symmetric but not essentially self-adjoint. Our example will be essentially the momentum operator on a finite interval, with "wrong" boundary conditions. (A more sophisticated example is given in Sect. 9.10.) We take our Hilbert space to be $L^2([0, 1])$.

Proposition 9.27 Let $\mathrm{Dom}(A) \subset L^2([0, 1])$ be the space of continuously differentiable functions $\psi$ on $[0, 1]$ satisfying

\[ \psi(0) = \psi(1) = 0. \]

For $\psi \in \mathrm{Dom}(A)$, define

\[ A\psi = -i\hbar \frac{d\psi}{dx}. \]

Then $A$ is symmetric but not essentially self-adjoint.

We can understand the failure of essential self-adjointness of $A$ in practical terms as a failure of the spectral theorem. The eigenvector equation $A\psi = \lambda\psi$ for $\lambda \in \mathbb{R}$ is a first-order ordinary differential equation, whose general solution is $\psi(x) = c e^{i\lambda x/\hbar}$, where $c$ is a constant. The only way such a function can satisfy the boundary conditions $\psi(0) = \psi(1) = 0$ is if $c = 0$, in which case $\psi$ is the zero vector. Thus, $A$ has no eigenvectors. Furthermore, taking the closure of $A$ does not help, because, as the proof will show, the boundary conditions survive taking the closure.

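Proof of symmetry. Using integration by parts we see that for all $\phi$ and $\psi$ in $\mathrm{Dom}(A)$ we have

\[ \int_0^1 \overline{\phi(x)}\, \frac{d\psi}{dx}\, dx = \overline{\phi(1)}\psi(1) - \overline{\phi(0)}\psi(0) - \int_0^1 \overline{\frac{d\phi}{dx}}\, \psi(x)\, dx. \tag{9.12} \]

Since we assume $\phi$ and $\psi$ are in $\mathrm{Dom}(A)$, the boundary terms are zero and we get

\[ \Big\langle \phi, \frac{d\psi}{dx} \Big\rangle_{L^2([0,1])} = -\Big\langle \frac{d\phi}{dx}, \psi \Big\rangle_{L^2([0,1])}. \]

Because there is a conjugate in one side of the inner product but not the other, it follows that

\[ \Big\langle \phi, -i\hbar \frac{d\psi}{dx} \Big\rangle_{L^2([0,1])} = \Big\langle -i\hbar \frac{d\phi}{dx}, \psi \Big\rangle_{L^2([0,1])}, \]

as claimed. ■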

We now consider $A^{cl}$ and $A^* = (A^{cl})^*$. We will see that there are elements of the domain of the adjoint that are not in the domain of the closure.

Lemma 9.28 If $\phi$ is a continuously differentiable function on $[0, 1]$, then $\phi \in \mathrm{Dom}(A^*)$ and $A^*\phi = -i\hbar\, d\phi/dx$.

Proof. If $\phi$ is continuously differentiable, then for any $\psi$ in $\mathrm{Dom}(A)$, we may integrate by parts as in (9.12). Since $\psi$ is zero at both ends of the interval, the boundary terms vanish and we obtain

\[ \langle \phi, A\psi \rangle = i\hbar \int_0^1 \overline{\frac{d\phi}{dx}}\, \psi(x)\, dx = \int_0^1 \overline{\Big( -i\hbar \frac{d\phi}{dx} \Big)}\, \psi(x)\, dx. \tag{9.13} \]

Since $d\phi/dx$ is continuous and hence in $L^2([0, 1])$, we see that (9.13) is a continuous linear functional, as a function of $\psi$ with fixed $\phi$. Thus, $\phi$ is in the domain of $A^*$, and $A^*\phi = -i\hbar\, d\phi/dx$. ■
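
As a numerical sanity check of Lemma 9.28 (a sketch under stated assumptions, not part of the proof), we can compare $\langle \phi, A\psi \rangle$ with $\langle -i\hbar\, d\phi/dx, \psi \rangle$ for one pair of functions. The choices $\phi(x) = x$, which does not vanish at $x = 1$ and so is not in $\mathrm{Dom}(A)$, and $\psi(x) = \sin(\pi x)$, which is in $\mathrm{Dom}(A)$, the units with $\hbar = 1$, and the midpoint quadrature helper `integrate` are all our own illustrative assumptions.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def integrate(f, n=20000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def phi(x):
    return x  # continuously differentiable, but phi(1) != 0


def dphi(x):
    return 1.0


def psi(x):
    return math.sin(math.pi * x)  # vanishes at both endpoints


def dpsi(x):
    return math.pi * math.cos(math.pi * x)


# <phi, A psi> with A psi = -i hbar dpsi/dx (conjugate on the first factor).
lhs = integrate(lambda x: complex(phi(x)).conjugate() * (-1j * HBAR * dpsi(x)))

# <-i hbar dphi/dx, psi>.
rhs = integrate(lambda x: (-1j * HBAR * dphi(x)).conjugate() * psi(x))

print(lhs, rhs)
```

Both quantities agree (with exact value $2i\hbar/\pi$ for these choices), even though $\phi$ violates the boundary condition: only $\psi$ needs to vanish at the endpoints for the boundary terms in (9.12) to drop out.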

Proof of Proposition 9.27. Suppose $\psi$ is in the domain of $A^{cl}$. Then there exist $\psi_n$ in $\mathrm{Dom}(A)$ such that $\psi_n$ converges to $\psi$ and $A\psi_n$ converges to some $\chi \in L^2([0, 1])$. Since the derivatives of the $\psi_n$'s are converging in $L^2$, the $\psi_n$'s themselves must be converging uniformly, as can be shown by writing each $\psi_n$ as the integral of its derivative. (See Exercise 10.) It follows that every element of $\mathrm{Dom}(A^{cl})$ is continuous and vanishes at both ends of the interval. On the other hand, $\mathrm{Dom}(A^*)$ contains all smooth functions, including many that do not vanish at the ends of the interval. Thus, $A^{cl}$ and $(A^{cl})^* = A^*$ do not have the same domains. ■

It follows from Lemma 9.28 that every complex number $\lambda$ belongs to the spectrum of $A^{cl}$. See Exercise 9.

The reason that $A$ fails to be essentially self-adjoint is that we impose too many boundary conditions on functions in the domain of $A$, which results in there being too few boundary conditions (in this case, no boundary conditions at all) on functions in the domain of $A^*$. In this example, $A^*$ is given by the same formula as $A$ ($-i\hbar\, d/dx$ in both cases), but the domain of $A^*$ is bigger than the domain of $A^{cl}$.

Suppose we define another operator $B$, still given by the formula $-i\hbar\, d/dx$, but with the domain of $B$ to be the space of continuously differentiable functions $\psi$ with $\psi(0) = \psi(1)$. If we integrate by parts as in (9.12), the boundary terms will cancel, showing that $B$ is symmetric. Meanwhile, the functions $\psi_n(x) := e^{2\pi i n x}$, $n \in \mathbb{Z}$, form an orthonormal basis for $L^2([0, 1])$ consisting of eigenvectors for $B$, with real eigenvalues $\lambda_n = 2\pi n \hbar$. Thus, by Example 9.25, $B$ is essentially self-adjoint.
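
The claims about the functions $e^{2\pi i n x}$ can be spot-checked numerically. The sketch below is only an illustration: the helper names `psi`, `inner`, and `apply_B`, the choice of units with $\hbar = 1$, and the use of a Riemann sum and a central difference quotient in place of the exact integral and derivative are all assumptions of ours.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def psi(n, x):
    """The function psi_n(x) = exp(2 pi i n x)."""
    return cmath.exp(2j * math.pi * n * x)


def inner(m, n, samples=1024):
    """Riemann-sum approximation of <psi_m, psi_n> on [0, 1]."""
    total = sum(
        psi(m, k / samples).conjugate() * psi(n, k / samples)
        for k in range(samples)
    )
    return total / samples


def apply_B(n, x, h=1e-6):
    """Central-difference approximation of (-i hbar d/dx) psi_n at x."""
    derivative = (psi(n, x + h) - psi(n, x - h)) / (2 * h)
    return -1j * HBAR * derivative


# Orthonormality: close to 1 on the diagonal, close to 0 off it.
print(abs(inner(3, 3)), abs(inner(1, 4)))

# Eigenvalue relation: B psi_n is approximately (2 pi n hbar) psi_n.
x0 = 0.3
print(apply_B(2, x0), 2 * math.pi * 2 * HBAR * psi(2, x0))
```

Each $\psi_n$ also satisfies the periodic boundary condition $\psi_n(0) = \psi_n(1)$, so it lies in the domain of the operator described in the text, while none of them (for any $n$) vanishes at the endpoints.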


9.7  An  Example 

We now give an example of an operator that is essentially self-adjoint. Let $C_c^{\infty}(\mathbb{R})$ denote the space of smooth, compactly supported functions on $\mathbb{R}$.

Proposition 9.29 Let $P$ be the densely defined operator with $\mathrm{Dom}(P) = C_c^{\infty}(\mathbb{R}) \subset L^2(\mathbb{R})$ and given by $P\psi = -i\hbar\, d\psi/dx$. Then $P$ is essentially self-adjoint.

Proof. Our strategy is to apply Corollary 9.22. Since $P$ is symmetric, we expect that $P^*$ will be given by the formula $-i\hbar\, d/dx$, on some suitable domain inside $L^2(\mathbb{R})$. Thus, if $\psi \in \ker(P^* + iI)$, this should mean that $-i\hbar\, d\psi/dx = -i\psi$, or $d\psi/dx = (1/\hbar)\psi(x)$, which ought to imply that $\psi(x) = c e^{x/\hbar}$, for some constant $c$. Since $c e^{x/\hbar}$ belongs to $L^2(\mathbb{R})$ only if $c = 0$, we hope to conclude that $\psi = 0$.

To say that $\psi \in L^2(\mathbb{R})$ belongs to the kernel of $P^* + iI$ means that $\psi$ belongs to $\mathrm{Dom}(P^*)$ and that $P^*\psi = -i\psi$. This holds if and only if

\[ -i\hbar \int_{\mathbb{R}} \frac{d\chi}{dx}\, \overline{\psi(x)}\, dx = i \int_{\mathbb{R}} \chi(x)\, \overline{\psi(x)}\, dx \]

for all $\chi \in C_c^{\infty}(\mathbb{R})$. For any $\xi \in C_c^{\infty}(\mathbb{R})$, if we take $\chi(x) = \xi(x) e^{-x/\hbar}$ and combine the integrals into one, we get

\[ 0 = \int_{\mathbb{R}} \Big[ \hbar e^{-x/\hbar} \frac{d\xi}{dx} - e^{-x/\hbar}\xi(x) + e^{-x/\hbar}\xi(x) \Big] \overline{\psi(x)}\, dx = \hbar \int_{\mathbb{R}} \frac{d\xi}{dx}\, e^{-x/\hbar}\, \overline{\psi(x)}\, dx. \tag{9.14} \]

Now, (9.14) says that the derivative of $e^{-x/\hbar}\psi(x)$ in the weak or distributional sense is zero. (See Proposition A.29 in Appendix A.3.3.) Thus, by the remarks immediately following Proposition A.5, we must have $e^{-x/\hbar}\psi(x) = c$ for some $c$, meaning that $\psi(x) = c e^{x/\hbar}$. Since we also assume that $\psi$ belongs to $\mathrm{Dom}(P^*) \subset L^2(\mathbb{R})$, we must have $c = 0$, so that $\psi$ is the zero element of $L^2(\mathbb{R})$.

We have shown, then, that only 0 belongs to the kernel of $P^* + iI$. A similar argument with $i$ replaced by $-i$ and $e^{x/\hbar}$ by $e^{-x/\hbar}$ shows that only 0 belongs to the kernel of $P^* - iI$. Thus, by Corollary 9.22, $P$ is essentially self-adjoint. ■
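
The step "$c e^{x/\hbar}$ belongs to $L^2(\mathbb{R})$ only if $c = 0$" can be made explicit with a one-line computation. The truncation radius $R$ below is an auxiliary symbol introduced for this sketch:

```latex
\begin{align*}
\int_{-R}^{R} \big| c\, e^{x/\hbar} \big|^2 \, dx
  = |c|^2 \int_{-R}^{R} e^{2x/\hbar} \, dx
  = \frac{|c|^2 \hbar}{2} \left( e^{2R/\hbar} - e^{-2R/\hbar} \right)
  \longrightarrow \infty \quad \text{as } R \to \infty \text{ unless } c = 0 .
\end{align*}
```

The same computation with $e^{-x/\hbar}$ in place of $e^{x/\hbar}$ gives the identical value, so neither exponential is square-integrable on the whole line. Contrast this with the interval $[0, 1]$ of Sect. 9.6, where both exponentials are square-integrable and essential self-adjointness fails.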


9.8  The  Basic  Operators  of  Quantum  Mechanics 


In this section, we consider several of the unbounded self-adjoint operators that arise in quantum mechanics. We find natural domains of self-adjointness for the position, momentum, kinetic energy, and potential energy operators. Since Schrödinger operators are more complicated to analyze, we postpone a discussion of them until the next section. We begin with the potential energy operator.


Proposition 9.30 Suppose $V : \mathbb{R}^n \to \mathbb{R}$ is a measurable function. Let $V(X)$ be the unbounded operator with domain

\[ \mathrm{Dom}(V(X)) = \{ \psi \in L^2(\mathbb{R}^n) \mid V(x)\psi(x) \in L^2(\mathbb{R}^n) \} \]

and given by

\[ [V(X)\psi](x) = V(x)\psi(x). \]

Then $\mathrm{Dom}(V(X))$ is dense in $L^2(\mathbb{R}^n)$ and $V(X)$ is self-adjoint on this domain.


Proof. Define a subset $E_m$ of $\mathbb{R}^n$ by

\[ E_m = \{ x \in \mathbb{R}^n \mid |V(x)| < m \}, \]

so that $\cup_m E_m = \mathbb{R}^n$. Then for any $\psi \in L^2(\mathbb{R}^n)$, the function $\psi 1_{E_m}$ belongs to $\mathrm{Dom}(V(X))$. On the other hand, using dominated convergence, we have $\psi 1_{E_m} \to \psi$ as $m \to \infty$, establishing that $\mathrm{Dom}(V(X))$ is dense.

Since $V$ is real-valued, it is easy to see that $V(X)$ is symmetric on $\mathrm{Dom}(V(X))$. Thus, $V(X)^*$ is an extension of $V(X)$.

Meanwhile, suppose $\phi \in \mathrm{Dom}(V(X)^*)$, meaning that

\[ \psi \mapsto \int_{\mathbb{R}^n} \overline{\phi(x)}\, V(x)\, \psi(x)\, dx, \qquad \psi \in \mathrm{Dom}(V(X)), \tag{9.15} \]

is a bounded linear functional. This linear functional has a unique bounded extension to $L^2(\mathbb{R}^n)$ and, thus, there exists a unique $\chi \in L^2(\mathbb{R}^n)$ such that

\[ \int_{\mathbb{R}^n} \overline{\phi(x)}\, V(x)\, \psi(x)\, dx = \int_{\mathbb{R}^n} \overline{\chi(x)}\, \psi(x)\, dx, \tag{9.16} \]

or

\[ \int_{\mathbb{R}^n} \Big[ \overline{\phi(x)}\, V(x) - \overline{\chi(x)} \Big] \psi(x)\, dx = 0 \]

for all $\psi \in \mathrm{Dom}(V(X))$.

Taking $\psi = (\phi V - \chi) 1_{E_m}$, we see that $\phi V - \chi$ is zero almost everywhere on $E_m$, for all $m$, hence zero almost everywhere on $\mathbb{R}^n$. Thus, $\phi V$ is equal to $\chi$ as an element of $L^2(\mathbb{R}^n)$. This shows that $\phi \in \mathrm{Dom}(V(X))$. Thus, actually, $\mathrm{Dom}(V(X)^*) = \mathrm{Dom}(V(X))$. Since we have already shown that $V(X)^*$ is an extension of $V(X)$, we conclude that $V(X)$ is self-adjoint on $\mathrm{Dom}(V(X))$. ■

If we specialize the preceding proposition to the case $V(x) = x_j$, we obtain the following result about the position operator.

Corollary 9.31 The position operator $X_j$ is self-adjoint on the domain

\[ \mathrm{Dom}(X_j) = \{ \psi \in L^2(\mathbb{R}^n) \mid x_j \psi(x) \in L^2(\mathbb{R}^n) \}. \]
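
To see that the domain in Corollary 9.31 is a proper subspace of $L^2$, consider (in the case $n = 1$) the function $\psi(x) = 1/(1 + |x|)$. This example, the truncation radius $R$, and the helper names below are our own illustrative assumptions. The sketch evaluates the truncated integrals of $|\psi|^2$ and $|x\psi|^2$ over $[-R, R]$ in closed form and cross-checks them against a midpoint quadrature.

```python
import math


def norm_sq(R):
    """Integral of |psi|^2 over [-R, R] for psi(x) = 1 / (1 + |x|)."""
    return 2.0 * (1.0 - 1.0 / (1.0 + R))


def x_norm_sq(R):
    """Integral of |x psi(x)|^2 over [-R, R] for the same psi."""
    return 2.0 * (R - 2.0 * math.log(1.0 + R) + 1.0 - 1.0 / (1.0 + R))


def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


for R in (10.0, 1e3, 1e6):
    # First value stays below 2; second grows roughly like 2R.
    print(R, norm_sq(R), x_norm_sq(R))
```

The first integral stays below 2 for every $R$, so $\psi \in L^2(\mathbb{R})$, while the second grows roughly like $2R$, so $x\psi(x)$ is not square-integrable and $\psi$ is not in the domain of the position operator.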


We now turn to consideration of the momentum operator. Since the Fourier transform converts $\partial/\partial x_j$ into multiplication by $ik_j$ (Proposition A.17), we can use the preceding results on multiplication operators to obtain a natural domain on which the momentum operator is self-adjoint.


Proposition 9.32 For each $j = 1, 2, \ldots, n$, define a domain $\mathrm{Dom}(P_j) \subset L^2(\mathbb{R}^n)$ as follows:

$$\mathrm{Dom}(P_j) = \left\{ \psi \in L^2(\mathbb{R}^n) \,\middle|\, k_j\hat\psi(k) \in L^2(\mathbb{R}^n) \right\},$$

where $\hat\psi$ is the Fourier transform of $\psi$. Define $P_j$ on this domain by

$$P_j\psi = \mathcal{F}^{-1}\bigl(\hbar k_j \hat\psi(k)\bigr).$$

Then $P_j$ is self-adjoint on $\mathrm{Dom}(P_j)$.

The domain $\mathrm{Dom}(P_j)$ of $P_j$ can also be described as the set of all $\psi \in L^2(\mathbb{R}^n)$
such that $\partial\psi/\partial x_j$, computed in the distribution sense, belongs to
$L^2(\mathbb{R}^n)$. For any $\psi \in \mathrm{Dom}(P_j)$, we have $P_j\psi = -i\hbar\,\partial\psi/\partial x_j$, where $\partial\psi/\partial x_j$
is computed in the distribution sense.

Saying that the distributional derivative of $\psi$ belongs to $L^2(\mathbb{R}^n)$ means
(Proposition A.29) that there exists a (unique) $\phi$ in $L^2(\mathbb{R}^n)$ such that

$$-\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. If $\psi$ is continuously differentiable, then the
distributional derivative of $\psi$ coincides with the ordinary derivative of $\psi$. Thus, if
$\psi \in L^2(\mathbb{R}^n)$ is continuously differentiable, then $\psi$ belongs to $\mathrm{Dom}(P_j)$ if
and only if $\partial\psi/\partial x_j$, computed in the pointwise sense, belongs to $L^2(\mathbb{R}^n)$,
in which case $P_j\psi = -i\hbar\,\partial\psi/\partial x_j$. On the other hand, if $\psi \in \mathrm{Dom}(P_j)$, it is
not necessarily the case that $\psi$ is continuously differentiable.
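As a sketch of the last remark, take $n = 1$ and the function $e^{-|x|}$, which has a corner at the origin. The constant $c > 0$ below is an assumption standing for whatever normalization the Fourier transform convention imposes; only the decay in $k$ matters.

```latex
\psi(x) = e^{-|x|}, \qquad
\hat\psi(k) = \frac{c}{1+k^{2}}, \qquad
\int_{\mathbb{R}} \bigl|k\,\hat\psi(k)\bigr|^{2}\,dk
  = c^{2}\int_{\mathbb{R}} \frac{k^{2}}{(1+k^{2})^{2}}\,dk
  = \frac{\pi c^{2}}{2} < \infty .
```

So this $\psi$ lies in the domain of the momentum operator even though it is not continuously differentiable; its distributional derivative is the $L^2$ function $-\mathrm{sgn}(x)\,e^{-|x|}$.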

In the case $n = 1$, the domain of $P_1$ certainly contains $C_c^\infty(\mathbb{R})$, since each
element $\psi$ of $C_c^\infty(\mathbb{R})$ is a Schwartz function (Definition A.15), so that $\hat\psi$
is also a Schwartz function, in which case $k\hat\psi(k)$ belongs to $L^2(\mathbb{R})$. Now,
as shown in Sect. 9.7, the operator $-i\hbar\,d/dx$ is essentially self-adjoint on
$C_c^\infty(\mathbb{R})$, which means that this operator has a unique self-adjoint extension.
This self-adjoint extension must, therefore, agree with the operator $P_1$ in
the $n = 1$ case of Proposition 9.32.

Lemma 9.33 Suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that $\partial\psi/\partial x_j$, computed
in the distribution sense, is equal to an $L^2$ function $\phi$. Then $\hat\phi(k) =
ik_j\hat\psi(k)$, showing that $k_j\hat\psi(k)$ belongs to $L^2(\mathbb{R}^n)$.

Conversely, suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that $k_j\hat\psi(k)$ belongs to
$L^2(\mathbb{R}^n)$. Then $\partial\psi/\partial x_j$, computed in the distribution sense, is equal to the
$L^2$ function $\mathcal{F}^{-1}(ik_j\mathcal{F}(\psi))$.

Proof. Suppose $\partial\psi/\partial x_j$, computed in the distribution sense, is equal to the
$L^2$ function $\phi$ (see Definition A.28). Then by the unitarity of the Fourier
transform (Theorem A.19) and its behavior with respect to differentiation
(Proposition A.17), we have

$$\langle \chi, \phi \rangle = -\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle = -\left\langle \mathcal{F}\!\left(\frac{\partial\chi}{\partial x_j}\right), \mathcal{F}(\psi) \right\rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Thus

$$\langle \mathcal{F}(\chi), \mathcal{F}(\phi) \rangle = -\langle ik_j\mathcal{F}(\chi), \mathcal{F}(\psi) \rangle, \qquad \chi \in C_c^\infty(\mathbb{R}^n).$$

Writing this equality out as an integral, we have

$$\int \overline{\hat\chi(k)}\,\hat\phi(k)\,dk = -\int \overline{ik_j\hat\chi(k)}\,\hat\psi(k)\,dk = \int \overline{\hat\chi(k)}\,ik_j\hat\psi(k)\,dk \tag{9.17}$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

We now claim that because (9.17) holds for all $\chi \in C_c^\infty(\mathbb{R}^n)$, we must
have $\hat\phi(k) = ik_j\hat\psi(k)$ for almost every $k$. Using the Stone-Weierstrass
theorem and Theorem A.10, it is not hard to show that the space of smooth
functions with support in $[a, b]$ is dense in $L^2([a, b])$, for all $a < b$ in $\mathbb{R}$.
Since both $\hat\phi$ and $k_j\hat\psi(k)$ are locally square-integrable, we see that these
two functions are equal almost everywhere on $[a, b]$, for all $a < b$ in $\mathbb{R}$, and
hence equal almost everywhere on $\mathbb{R}$.

Since $\hat\phi$ is globally square-integrable, so is $k_j\hat\psi(k)$. Furthermore, by the
injectivity of the $L^2$ Fourier transform, we have

$$\frac{\partial\psi}{\partial x_j} = \phi = \mathcal{F}^{-1}\bigl(ik_j\hat\psi(k)\bigr),$$

as claimed.

The argument for the second part of the lemma is similar and left as an
exercise (Exercise 12). ■

Proof of Proposition 9.32. By Proposition 9.30, the operator of
multiplication by $k_j$ is an unbounded self-adjoint operator on $L^2(\mathbb{R}^n)$, with
domain equal to the set of $\phi$ for which $k_j\phi(k)$ belongs to $L^2(\mathbb{R}^n)$. It then
follows from the unitarity of the Fourier transform that $P_j = \hbar\,\mathcal{F}^{-1} M_{k_j} \mathcal{F}$ is
self-adjoint on $\mathcal{F}^{-1}(\mathrm{Dom}(M_{k_j}))$, where $M_{k_j}$ denotes multiplication by $k_j$.
The second characterization of $\mathrm{Dom}(P_j)$ follows from Lemma 9.33. ■

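Proposition 9.34 Define a domain $\mathrm{Dom}(\Delta)$ as follows:

$$\mathrm{Dom}(\Delta) = \left\{ \psi \in L^2(\mathbb{R}^n) \,\middle|\, |k|^2\hat\psi(k) \in L^2(\mathbb{R}^n) \right\}.$$

Define $\Delta$ on this domain by the expression

$$\Delta\psi = -\mathcal{F}^{-1}\bigl(|k|^2\hat\psi(k)\bigr), \tag{9.18}$$

where $\hat\psi$ is the Fourier transform of $\psi$ and $\mathcal{F}^{-1}$ is the inverse Fourier transform.
Then $\Delta$ is self-adjoint on $\mathrm{Dom}(\Delta)$.

The domain $\mathrm{Dom}(\Delta)$ may also be described as the set of all $\psi \in L^2(\mathbb{R}^n)$
such that $\Delta\psi$, computed in the distribution sense, belongs to $L^2(\mathbb{R}^n)$. If
$\psi \in \mathrm{Dom}(\Delta)$, then $\Delta\psi$ as defined by (9.18) agrees with $\Delta\psi$ computed in
the distribution sense.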


The proof of Proposition 9.34 is extremely similar to that of Proposition 9.32
and is omitted. Of course, the kinetic energy operator $-\hbar^2\Delta/(2m)$
is also self-adjoint on the same domain as $\Delta$. It is easy to see from (9.18)
and the unitarity of the Fourier transform that $-\hbar^2\Delta/(2m)$ is non-negative,
that is, that

$$\left\langle \psi, -\frac{\hbar^2}{2m}\Delta\psi \right\rangle \ge 0$$

for all $\psi \in \mathrm{Dom}(\Delta)$.
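The non-negativity claim can be written out in one line. This is a sketch of the step the text calls easy, using only (9.18) and the unitarity of the Fourier transform:

```latex
\left\langle \psi, -\frac{\hbar^{2}}{2m}\Delta\psi \right\rangle
 = \frac{\hbar^{2}}{2m}\left\langle \hat\psi, |k|^{2}\hat\psi \right\rangle
 = \frac{\hbar^{2}}{2m}\int_{\mathbb{R}^{n}} |k|^{2}\,\bigl|\hat\psi(k)\bigr|^{2}\,dk \;\ge\; 0 .
```

The integral is finite for every element of the domain, since both $\hat\psi$ and $|k|^2\hat\psi$ are square-integrable there.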
Using the same reasoning as in Sects. 9.6 and 9.7, it is not hard to show
that the operators $P_j$ and $\Delta$ are essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$. See
Exercise 16.

Care must be exercised in applying Proposition 9.34. Although the function

$$\psi(x) := \frac{1}{|x|}$$

is harmonic on $\mathbb{R}^3 \setminus \{0\}$, the Laplacian over $\mathbb{R}^3$ of $\psi$ in the distribution
sense is not zero (Exercise 13). (It can be shown, by carefully analyzing the
calculation in the proof of Proposition 9.35, that $\Delta\psi$ is a nonzero multiple
of a $\delta$-function.) This example shows that if a function $\psi$ has a singularity,
calculating the Laplacian of $\psi$ away from the singularity may not give the
correct distributional Laplacian of $\psi$. For example, the function $\phi$ in $L^2(\mathbb{R}^3)$
given  by 


(9.19) 


is not in $\mathrm{Dom}(\Delta)$, even though both $\phi$ and $\Delta\phi$ are (by direct computation)
square-integrable over $\mathbb{R}^3 \setminus \{0\}$. Indeed, when $n \le 3$, every element of
$\mathrm{Dom}(\Delta)$ is continuous (Exercise 14).


Proposition 9.35 Suppose $\psi(x) = g(x) f(|x|)$, where $g$ is a smooth function
on $\mathbb{R}^n$ and $f$ is a smooth function on $(0, \infty)$. Suppose also that $f$
satisfies

$$\lim_{r \to 0^+} r^{n-1} f(r) = 0, \qquad \lim_{r \to 0^+} r^{n-1} f'(r) = 0.$$

If both $\psi$ and $\Delta\psi$ are square-integrable over $\mathbb{R}^n \setminus \{0\}$, then $\psi$ belongs to
$\mathrm{Dom}(\Delta)$.

Note that the second condition in the proposition fails if $n = 3$ and
$f(r) = 1/r$. We will make use of this result in Chap. 18.
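The failure of the second condition for $f(r) = 1/r$ in dimension 3 can be checked numerically. The following is a minimal sketch; the helper name `boundary_weights` and the sample radii are assumptions made for illustration only.

```python
# Tabulate r^(n-1) f(r) and r^(n-1) f'(r) for small r, with n = 3 and
# f(r) = 1/r as in the remark. boundary_weights is a hypothetical helper.

def boundary_weights(f, df, n, radii):
    """Return a list of (r, r**(n-1) * f(r), r**(n-1) * df(r))."""
    return [(r, r ** (n - 1) * f(r), r ** (n - 1) * df(r)) for r in radii]

def f(r):
    return 1.0 / r          # radial factor from the remark

def df(r):
    return -1.0 / r ** 2    # its derivative f'(r)

radii = [10.0 ** (-k) for k in range(1, 7)]
rows = boundary_weights(f, df, 3, radii)
for r, first, second in rows:
    print(f"r={r:.0e}  r^2 f(r)={first:.1e}  r^2 f'(r)={second:.1e}")
```

The first column tends to zero while the second stays at $-1$, so only the condition on $f'$ is violated.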

Proof. To apply Proposition 9.34, we need to compute $\langle \psi, \Delta\chi \rangle$, for each
$\chi \in C_c^\infty(\mathbb{R}^n)$. We choose a large cube $C$, centered at the origin and such
that the support of $\chi$ is contained in the interior of $C$. Then we consider
the integral of $\bar\psi\,(\partial^2\chi/\partial x_j^2)$ over $C \setminus C_\varepsilon$, where $C_\varepsilon$ is a cube centered at the
origin and having side-length $\varepsilon$. We evaluate the $x_j$-integral first and we
integrate by parts twice. For "good" values of the remaining variables, $x_j$
ranges over all of $C$, in which case there are no boundary terms to worry
about. For "bad" values of the remaining variables, we get two kinds of
boundary terms, one involving $\bar\psi\,(\partial\chi/\partial x_j)$ and one involving $(\partial\bar\psi/\partial x_j)\,\chi$,
in both cases integrated over two opposite faces of $C_\varepsilon$.

Now,

$$\frac{\partial\psi}{\partial x_j} = \frac{\partial g}{\partial x_j}\, f(|x|) + g(x)\, \frac{df}{dr}\, \frac{x_j}{r}.$$

Since the area of the faces of the cube is $\varepsilon^{n-1}$, the assumption on $f$ will
cause the boundary terms to disappear in the limit as $\varepsilon$ tends to zero.
Furthermore, both $\psi$ and $\Delta\psi$ are in $L^2(\mathbb{R}^n)$ and thus in $L^1(C)$, where in
the case of $\Delta\psi$, we simply leave the value at the origin (which is a set of
measure zero) undefined. Thus, integrals of $\bar\psi\,\Delta\chi$ and $(\Delta\bar\psi)\,\chi$ over $C \setminus C_\varepsilon$
will converge to integrals over $C$. Since the boundary terms vanish in the
limit, we are left with

$$\langle \psi, \Delta\chi \rangle = \langle \Delta\psi, \chi \rangle.$$

Thus, the distributional Laplacian of $\psi$ is simply integration against the
"pointwise" Laplacian, ignoring the origin. Proposition 9.34 then tells us
that $\psi \in \mathrm{Dom}(\Delta)$. ■


9.9  Sums  of  Self-Adjoint  Operators 


In the previous section, we have succeeded in defining the Laplacian $\Delta$,
and hence also the kinetic energy operator $-\hbar^2\Delta/(2m)$, as a self-adjoint
operator on a natural dense domain in $L^2(\mathbb{R}^n)$. We have also defined the
potential energy operator $V(X)$ as a self-adjoint operator on a different
dense domain, for any measurable function $V : \mathbb{R}^n \to \mathbb{R}$. To obtain the
Schrödinger operator $-\hbar^2\Delta/(2m) + V(X)$, we “merely” have to make sense
of the sum of two unbounded self-adjoint operators. This task, however,
turns out to be more difficult than might be expected. In particular, if
$V$ is a highly singular function, then $-\hbar^2\Delta/(2m) + V(X)$ may fail to be
self-adjoint or essentially self-adjoint on any natural domain.

Definition 9.36 If $A$ and $B$ are unbounded operators on $\mathbf{H}$, then $A + B$
is the operator with domain

$$\mathrm{Dom}(A + B) := \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$$

and given by $(A + B)\psi = A\psi + B\psi$.

The sum of two unbounded self-adjoint operators $A$ and $B$ may fail to be
self-adjoint or even essentially self-adjoint. [If, however, $B$ is bounded with
$\mathrm{Dom}(B) = \mathbf{H}$, then Proposition 9.13 shows that $A + B$ is self-adjoint on
$\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$.] For one thing, if $A$ and $B$ are unbounded,
then $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$ may fail to be dense in $\mathbf{H}$. But even if $\mathrm{Dom}(A) \cap
\mathrm{Dom}(B)$ is dense in $\mathbf{H}$, it can easily happen that $A + B$ is not essentially
self-adjoint on this domain. (See, for example, Sect. 9.10.) Many things that
are simple for bounded self-adjoint operators become complicated when
dealing with unbounded self-adjoint operators!

In this section, we examine criteria on a function $V$ under which the
Schrödinger operator

$$\hat{H} = -\frac{\hbar^2}{2m}\Delta + V$$

is self-adjoint or essentially self-adjoint on some natural domain inside
$L^2(\mathbb{R}^n)$.

Theorem 9.37 (Kato-Rellich Theorem) Suppose that $A$ and $B$ are
unbounded self-adjoint operators on $\mathbf{H}$. Suppose that $\mathrm{Dom}(A) \subset \mathrm{Dom}(B)$
and that there exist positive constants $a$ and $b$ with $a < 1$ such that

$$\|B\psi\| \le a\,\|A\psi\| + b\,\|\psi\| \tag{9.20}$$

for all $\psi \in \mathrm{Dom}(A)$. Then $A + B$ is self-adjoint on $\mathrm{Dom}(A)$ and essentially
self-adjoint on any subspace of $\mathrm{Dom}(A)$ on which $A$ is essentially
self-adjoint. Furthermore, if $A$ is non-negative, then the spectrum of $A + B$ is
bounded below by $-b/(1 - a)$.

Note that since we assume $\mathrm{Dom}(B) \supset \mathrm{Dom}(A)$, the natural domain for
$A + B$ is $\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$. An operator $B$ satisfying (9.20)
is said to be relatively bounded with respect to $A$, with relative bound $a$.
Proof. We use the trivial variant of Theorem 9.21 given in Exercise 8.
Choose a positive real number $\mu$ large enough that $a + b/\mu < 1$, which is
possible because we assume $a < 1$. Then for any $\psi \in \mathrm{Dom}(A)$, we have

$$(A + B + i\mu I)\psi = \bigl(B(A + i\mu I)^{-1} + I\bigr)(A + i\mu I)\psi. \tag{9.21}$$

For any $\psi \in \mathbf{H}$, we compute that

$$\|B(A + i\mu I)^{-1}\psi\| \le a\,\|A(A + i\mu I)^{-1}\psi\| + b\,\|(A + i\mu I)^{-1}\psi\| \le \left(a + \frac{b}{\mu}\right)\|\psi\|. \tag{9.22}$$

Here we have made use of the estimates

$$\|A(A + i\mu I)^{-1}\| \le 1, \qquad \|(A + i\mu I)^{-1}\| \le \frac{1}{\mu},$$

both of which are elementary (Exercise 17).
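The two resolvent estimates can be sanity-checked in a toy model. The sketch below assumes that $A$ is a diagonal (multiplication) operator on a finite-dimensional space, so that operator norms are maxima of absolute values; the sample spectrum and the constants `a`, `b` are illustrative assumptions, not data from the text.

```python
# Toy model (assumption): A acts as multiplication by the real numbers in
# lam, so (A + i*mu*I)^{-1} is multiplication by 1/(lam + i*mu).

def resolvent_norms(lam, mu):
    """Return (||A (A + i mu I)^{-1}||, ||(A + i mu I)^{-1}||) for diagonal A."""
    norm_a_res = max(abs(x / complex(x, mu)) for x in lam)
    norm_res = max(abs(1 / complex(x, mu)) for x in lam)
    return norm_a_res, norm_res

def relative_bound_constant(a, b, mu):
    """The constant a + b/mu that appears on the right of (9.22)."""
    return a + b / mu

lam = [k * k - 50.0 for k in range(200)]   # sample real spectrum, both signs
a, b = 0.5, 3.0
mu = 2 * b / (1 - a)                       # any mu > b/(1-a) would do
n1, n2 = resolvent_norms(lam, mu)
print(n1, n2 * mu, relative_bound_constant(a, b, mu))
```

Both norms respect the stated bounds, and the chosen $\mu$ makes $a + b/\mu$ strictly less than one, which is exactly what the invertibility argument requires.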

If $C$ denotes the operator $B(A + i\mu I)^{-1}$, (9.22) tells us that $\|C\| \le
(a + b/\mu) < 1$. Thus, by Lemma 7.6, $C + I$ is invertible. Furthermore, since
$A$ is self-adjoint, $A + i\mu I$ maps $\mathrm{Dom}(A)$ onto $\mathbf{H}$. Thus, (9.21) tells us that
$A + B + i\mu I$ also maps $\mathrm{Dom}(A)$ onto $\mathbf{H}$. The same argument shows that
$A + B - i\mu I$ maps $\mathrm{Dom}(A)$ onto $\mathbf{H}$ and we conclude, by Exercise 8, that
$A + B$ is self-adjoint on $\mathrm{Dom}(A)$.

Suppose, in addition, that $A$ is non-negative. Let us replace $i\mu$ by $\lambda > 0$
in (9.21). Calculating as in (9.22), using the estimates in Exercise 18, we
obtain that

$$\|B(A + \lambda I)^{-1}\psi\| \le \left(a + \frac{b}{\lambda}\right)\|\psi\|$$

for all $\psi \in \mathbf{H}$. If $\lambda > b/(1 - a)$, then $a + b/\lambda < 1$, and by the above
argument, $\mathrm{Range}(A + B + \lambda I) = \mathbf{H}$. Furthermore, since $A + B + \lambda I$ is
self-adjoint, Proposition 9.12 tells us that $\ker(A + B + \lambda I) = \{0\}$. This shows
that $A + B + \lambda I$ is invertible and $-\lambda$ is in the resolvent set of $A + B$. We
conclude, then, that the spectrum of $A + B$ is contained in $[-b/(1 - a), +\infty)$.

The last part of the theorem, concerning essential self-adjointness, is left
as an exercise (Exercise 19). ■

Theorem 9.38 Suppose $n$ is at most 3 and $V : \mathbb{R}^n \to \mathbb{R}$ is a measurable
function that can be decomposed as a sum of two real-valued, measurable
functions $V_1$ and $V_2$, with $V_1$ belonging to $L^2(\mathbb{R}^n)$ and $V_2$ being
bounded. Then the Schrödinger operator $-\hbar^2\Delta/(2m) + V(X)$ is self-adjoint
on $\mathrm{Dom}(\Delta)$. Furthermore, $-\hbar^2\Delta/(2m) + V(X)$ is bounded below.

Implicit in the statement of the theorem is that $\mathrm{Dom}(V(X))$, as given
in Proposition 9.30, contains $\mathrm{Dom}(\Delta)$. A result similar to Theorem 9.38 holds in
$\mathbb{R}^n$, $n \ge 4$, but the condition that $V_1$ belongs to $L^2(\mathbb{R}^n)$ is replaced by the
condition that $V_1$ belongs to $L^p(\mathbb{R}^n)$ for some $p > n/2$. See Theorem X.20
in Volume II of [34].

Proof. We apply the Kato-Rellich theorem with $A = -\hbar^2\Delta/2m$ and $B =
V(X)$. Assume $\psi \in \mathrm{Dom}(\Delta)$ and fix some $\varepsilon > 0$. By Exercise 14, there
exists a constant $c_\varepsilon$ such that

$$|\psi(x)| \le \varepsilon\,\|\Delta\psi\| + c_\varepsilon\,\|\psi\|$$

for all $x \in \mathbb{R}^n$. Thus, if $V$ is as in the theorem and $\psi \in \mathrm{Dom}(\Delta)$,

$$\|V\psi\| \le \sup_x |\psi(x)|\;\|V_1\| + \sup_x |V_2(x)|\;\|\psi\| \le \varepsilon\,\|V_1\|\,\|\Delta\psi\| + \Bigl(c_\varepsilon\,\|V_1\| + \sup_x |V_2(x)|\Bigr)\|\psi\|.$$

This shows that $\mathrm{Dom}(V(X)) \supset \mathrm{Dom}(\Delta)$. Since $\varepsilon$ is arbitrary, we can
arrange for the constant in front of $\|\Delta\psi\|$ to be less than one and the
Kato-Rellich theorem applies. ■
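To match the estimate to the form (9.20), where the relative bound multiplies the norm of the kinetic energy operator applied to the function rather than the norm of its Laplacian, one can rescale. The following sketch writes $A$ for the kinetic energy operator and labels the resulting constants $a$ and $b$:

```latex
\|\Delta\psi\| = \frac{2m}{\hbar^{2}}\,\|A\psi\|
\quad\Longrightarrow\quad
\|V\psi\| \le
\underbrace{\frac{2m\,\varepsilon\,\|V_1\|}{\hbar^{2}}}_{a}\,\|A\psi\|
+ \underbrace{\Bigl(c_\varepsilon\,\|V_1\| + \sup_x|V_2(x)|\Bigr)}_{b}\,\|\psi\| ,
\qquad
a < 1 \ \text{ whenever } \ \varepsilon\,\|V_1\| < \frac{\hbar^{2}}{2m} .
```

Since the tolerance can be taken as small as desired, such a choice is always available, at the cost of a larger constant $b$.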


Theorem 9.39 Suppose $n$ is at most 3 and $V : \mathbb{R}^n \to \mathbb{R}$ is a measurable
function that can be decomposed as a sum of three real-valued, measurable
functions $V_1$, $V_2$, and $V_3$, with $V_1$ belonging to $L^2(\mathbb{R}^n)$, $V_2$ being
bounded, and $V_3$ being non-negative and locally square-integrable. Then
the Schrödinger operator $-\hbar^2\Delta/(2m) + V(X)$ is essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$.

The proof of this result would take us too far afield and is omitted. See
Theorem X.29 in Volume II of [34]. Note that we assume only that $V_3$ is
non-negative and locally square-integrable; $V_3$ can tend to $+\infty$ arbitrarily
fast at infinity. Again, the same result applies in $\mathbb{R}^n$, $n \ge 4$, if the condition
on $V_1$ is replaced by the assumption that $V_1 \in L^p(\mathbb{R}^n)$ for some $p > n/2$.


Proposition 9.40 Fix $a$ and $b$ in $\mathbb{R}^n$ and let $a \cdot X + b \cdot P$ denote the
operator given by

$$(a \cdot X + b \cdot P)\psi(x) = (a \cdot x)\psi(x) - i\hbar \sum_{j=1}^{n} b_j \frac{\partial\psi}{\partial x_j}.$$

Then $a \cdot X + b \cdot P$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$.

Proof. We use the same strategy as in Sect. 9.7, namely we explicitly
solve the equation $A^*\psi = \pm i\psi$ and find that there are no nonzero,
square-integrable solutions.

The case $b = 0$ is not hard to analyze and is left as an exercise
(Exercise 20). Assume, then, that $b \ne 0$. By making a rotational change of
variables, we can assume that $b = \alpha e_1$ and $a = \beta e_1 + \gamma e_2$, so that

$$(A\psi)(x) = (\beta x_1 + \gamma x_2)\psi(x) - i\hbar\alpha \frac{\partial\psi}{\partial x_1}. \tag{9.23}$$

(If $n = 1$, the $\gamma x_2$ term is not present.) As in the proof of Proposition 9.29,
the adjoint $A^*$ of $A$ will be given by the same formula as $A$, with $\mathrm{Dom}(A^*)$
consisting of those elements $\psi$ of $L^2(\mathbb{R}^n)$ for which the right-hand side of
(9.23), computed in the distributional sense, belongs to $L^2(\mathbb{R}^n)$.

We now apply the criterion for essential self-adjointness in Corollary 9.22.
We need to show that the equations $A^*\psi = i\psi$ and $A^*\psi = -i\psi$ have no
nonzero solutions in $\mathrm{Dom}(A^*)$. After rewriting the equation $A^*\psi = i\psi$ as

$$\frac{\partial\psi}{\partial x_1} = -\frac{i}{\alpha\hbar}(\beta x_1 + \gamma x_2)\psi(x) - \frac{1}{\alpha\hbar}\psi(x), \tag{9.24}$$

we can easily find the general distributional solution as

$$\psi(x) = c(x_2, \ldots, x_n) \exp\left\{ -\frac{i\beta}{2\alpha\hbar} x_1^2 - \frac{i\gamma}{\alpha\hbar} x_1 x_2 - \frac{1}{\alpha\hbar} x_1 \right\}. \tag{9.25}$$

[It is easily verified that if we let $\phi$ equal $\psi$ divided by the exponential on the
right-hand side of (9.25), then $\phi$ satisfies $\partial\phi/\partial x_1 = 0$ in the distributional
sense. Exercise 21 then tells us that $\phi$ must be a function of $x_2, \ldots, x_n$.]
Since the exponential factor is never square integrable as a function of $x_1$
with $x_2$ fixed, the only way that $\psi$ can be square integrable is if $c$ is zero
for almost every value of $(x_2, \ldots, x_n)$, in which case $\psi$ is the zero element
of $L^2(\mathbb{R}^n)$. A similar argument shows that the equation $A^*\psi = -i\psi$ has no
nonzero solutions. ■
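The form of the solution in the proof of Proposition 9.40 can be checked numerically. The sketch below assumes the solution is the coefficient $c$ times $\exp\{-i\beta x_1^2/(2\alpha\hbar) - i\gamma x_1 x_2/(\alpha\hbar) - x_1/(\alpha\hbar)\}$ and that the rewritten equation reads $\partial\psi/\partial x_1 = -(i/(\alpha\hbar))(\beta x_1 + \gamma x_2)\psi - \psi/(\alpha\hbar)$. The numerical values of the constants and the function names are arbitrary choices for illustration.

```python
import cmath

# Illustrative constants (assumptions): hbar, alpha, beta, gamma.
HBAR, ALPHA, BETA, GAMMA = 1.0, 2.0, 0.7, -1.3

def psi(x1, x2, c=1.0):
    """Candidate solution, with the coefficient c(x2, ...) held constant."""
    phase = (-1j * BETA * x1 ** 2 / (2 * ALPHA * HBAR)
             - 1j * GAMMA * x1 * x2 / (ALPHA * HBAR))
    return c * cmath.exp(phase - x1 / (ALPHA * HBAR))

def rhs(x1, x2):
    """Right-hand side of the first-order equation, evaluated on psi."""
    value = psi(x1, x2)
    return (-1j / (ALPHA * HBAR) * (BETA * x1 + GAMMA * x2) * value
            - value / (ALPHA * HBAR))

def dpsi_dx1(x1, x2, h=1e-5):
    """Central finite-difference approximation to the x1-derivative of psi."""
    return (psi(x1 + h, x2) - psi(x1 - h, x2)) / (2 * h)

x1, x2 = 0.4, -0.9
residual = abs(dpsi_dx1(x1, x2) - rhs(x1, x2))
modulus_ratio = abs(psi(x1, x2)) / cmath.exp(-x1 / (ALPHA * HBAR)).real
print(residual, modulus_ratio)
```

The residual is at the level of the finite-difference error, and the modulus of the solution is exactly the real exponential factor, which is the reason no nonzero solution can be square integrable in $x_1$.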

In this section, we will show that the Schrödinger operator $\hat{H} = P^2/(2m) -
X^4$ is not essentially self-adjoint on $C_c^\infty(\mathbb{R})$, even though $\hat{H}$ is certainly
symmetric. By contrast, $P^2/(2m) + X^4$ is essentially self-adjoint, by
Theorem 9.39. The operator $P^2/(2m) - X^4$ is a more serious counterexample
than the one in Sect. 12.2, in that it does not involve any obviously
incorrect choice of boundary conditions. On the other hand, it should not
be surprising that something goes “wrong” in a quantum system with a
potential equal to $-x^4$. After all, a classical system with this potential has
trajectories that go to infinity in finite time (see Exercise 4 in Chap. 2).

To show that $\hat{H}$ is not essentially self-adjoint, we will show that the
adjoint $\hat{H}^*$ is not symmetric. Suppose $\psi$ is a $C^\infty$ function such that both
$\psi$ and the function

$$-\frac{\hbar^2}{2m}\psi''(x) - x^4\psi(x) \tag{9.26}$$

belong to $L^2(\mathbb{R})$. Using integration by parts, as in the proof of Lemma 9.28,
we can see that $\psi$ is in the domain of $\hat{H}^*$ and $\hat{H}^*\psi$ is the function in (9.26).

We will construct an approximate eigenvector $\psi \in \mathrm{Dom}(\hat{H}^*)$ for $\hat{H}^*$ with
an imaginary eigenvalue $i\alpha$, which will show that $\hat{H}^*$ is not symmetric and
thus $\hat{H}$ is not essentially self-adjoint.

Theorem 9.41 Define an operator $\hat{H}$ with $\mathrm{Dom}(\hat{H}) = C_c^\infty(\mathbb{R})$ by the
formula

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - x^4.$$

Then $\hat{H}$ is not essentially self-adjoint.

In preparation for the proof, let us define a function $p(x)$ on $\mathbb{R}$ such that

$$\frac{p(x)^2}{2m} - x^4 = i\alpha,$$

that is,

$$p(x) = \sqrt{2m}\,\sqrt{x^4 + i\alpha}. \tag{9.27}$$

Here we take the square root that is in the first quadrant. The function
$p(x)$ represents “the momentum of a classical particle with energy $i\alpha$.”

Lemma 9.42 If $\psi_\alpha$ is given by

$$\psi_\alpha(x) = \frac{1}{\sqrt{p(x)}} \exp\left\{ \frac{i}{\hbar} \int_0^x p(y)\,dy \right\}, \tag{9.28}$$

then $\psi_\alpha$ belongs to $L^2(\mathbb{R})$ and the function

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} - x^4\psi_\alpha(x) \tag{9.29}$$

also belongs to $L^2(\mathbb{R})$. Furthermore, we have

$$\left( -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - x^4 - i\alpha \right)\psi_\alpha(x) = -\frac{\hbar^2}{2m}\,\psi_\alpha(x)\,m_\alpha(x),$$

where

$$m_\alpha(x) = \frac{5x^6}{(x^4 + i\alpha)^2} - \frac{3x^2}{x^4 + i\alpha}.$$

It will be apparent from the proof that the two terms in (9.29) are not
separately in $L^2(\mathbb{R})$. The motivation for the definition of $\psi_\alpha$ comes from
the WKB approximation (Chap. 15) with a complex value for the energy.

Proof. Let us consider the integral of $p$,

$$\int_0^x p(y)\,dy = \sqrt{2m} \int_0^x \sqrt{y^4 + i\alpha}\,dy.$$

Using the power series for $(1 + x)^a$ we see that for large $y$,

$$\sqrt{y^4 + i\alpha} = y^2\sqrt{1 + i\alpha/y^4} = y^2\left( 1 + \frac{i\alpha}{2y^4} + O\!\left(\frac{1}{y^8}\right) \right).$$

From this estimate, it is easy to see that the imaginary part of $\int_0^x p(y)\,dy$
remains bounded as $x$ tends to $\pm\infty$. It follows that the exponential in the
definition of $\psi_\alpha$ is bounded, from which it is easy to see that $\psi_\alpha$ is square
integrable.

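Now, using the formula for the second derivative of a product, we obtain

$$\hbar^2\frac{d^2}{dx^2}\psi_\alpha = \left[ -\frac{p(x)^2}{\sqrt{p(x)}} + i\hbar\frac{p'(x)}{\sqrt{p(x)}} + 2\hbar^2\left( -\frac{1}{2}\frac{p'(x)}{p(x)^{3/2}} \right)\frac{ip(x)}{\hbar} + \hbar^2\frac{d^2}{dx^2}\frac{1}{\sqrt{p(x)}} \right] \exp\left\{ \frac{i}{\hbar}\int_0^x p(y)\,dy \right\}. \tag{9.30}$$

The factor of $1/\sqrt{p(x)}$ in the definition of $\psi_\alpha$ was chosen precisely so that
the second and third terms in square brackets will cancel. If we replace
$p^2(x)$ in the numerator of the first term by $2m(x^4 + i\alpha)$, we obtain

$$-\frac{\hbar^2}{2m}\psi_\alpha''(x) - x^4\psi_\alpha - i\alpha\psi_\alpha = -\frac{\hbar^2}{2m}\left( \frac{d^2}{dx^2}\,p(x)^{-1/2} \right) \exp\left\{ \frac{i}{\hbar}\int_0^x p(y)\,dy \right\}.$$

It is then an elementary calculation to show that

$$\frac{d^2}{dx^2}\,p(x)^{-1/2} = p(x)^{-1/2}\left[ 5(x^4 + i\alpha)^{-2}x^6 - 3(x^4 + i\alpha)^{-1}x^2 \right],$$

from which the lemma follows. ■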
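The identity of Lemma 9.42 lends itself to a quick numerical check. The sketch below assumes units in which $\hbar = m = 1$, takes $\alpha = 1$, takes the wave function to be $p(x)^{-1/2}\exp\{(i/\hbar)\int_0^x p(y)\,dy\}$, and uses $m_\alpha(x) = 5x^6/(x^4+i\alpha)^2 - 3x^2/(x^4+i\alpha)$, which is what differentiating $p(x)^{-1/2}$ twice gives. The quadrature rule, step sizes, and function names are all illustrative assumptions.

```python
import cmath

# Assumptions for this check only: hbar = m = 1 and alpha = 1.
HBAR, M, ALPHA = 1.0, 1.0, 1.0

def p(x):
    """p(x) = sqrt(2m) sqrt(x^4 + i alpha); cmath.sqrt gives the first-quadrant root."""
    return cmath.sqrt(2 * M) * cmath.sqrt(x ** 4 + 1j * ALPHA)

def action(x, panels=2000):
    """Composite Simpson approximation to the integral of p over [0, x]."""
    h = x / panels
    total = p(0.0) + p(x)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * p(k * h)
    return total * h / 3

def psi(x):
    """The WKB-type function p^{-1/2} exp(i * action / hbar)."""
    return cmath.exp(1j * action(x) / HBAR) / cmath.sqrt(p(x))

def m_alpha(x):
    w = x ** 4 + 1j * ALPHA
    return 5 * x ** 6 / w ** 2 - 3 * x ** 2 / w

def residual(x, h=1e-3):
    """Absolute difference between the two sides of the lemma's identity."""
    d2 = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h ** 2
    lhs = -HBAR ** 2 / (2 * M) * d2 - x ** 4 * psi(x) - 1j * ALPHA * psi(x)
    rhs = -HBAR ** 2 / (2 * M) * psi(x) * m_alpha(x)
    return abs(lhs - rhs)

print(residual(0.5), residual(1.0))
```

The residuals are of the size expected from a second-order finite difference, which supports the cancellation of the first-derivative terms described in the proof.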

Proof of Theorem 9.41. If $\hat{H}$ were essentially self-adjoint, $\hat{H}^*$ (which
would coincide with $\hat{H}^{cl}$) would be self-adjoint and, in particular, symmetric.
If this were the case, we would have, by the proof of Lemma 7.8,

$$\left\langle (\hat{H}^* - i\alpha I)\psi, (\hat{H}^* - i\alpha I)\psi \right\rangle \ge \alpha^2 \langle \psi, \psi \rangle \tag{9.31}$$

for all $\psi \in \mathrm{Dom}(\hat{H}^*)$ and $\alpha \in \mathbb{R}$. But if $\psi_\alpha$ is the function in Lemma 9.42,
the discussion preceding Theorem 9.41 shows that $\psi_\alpha$ belongs to $\mathrm{Dom}(\hat{H}^*)$.
Furthermore, it is easily verified that there is a constant $C$ such that
$|m_\alpha(x)| \le C$ for all $\alpha \ge 1$ and $x \in \mathbb{R}$. Thus, for all sufficiently large
$\alpha$, we have

$$\left\| (\hat{H}^* - i\alpha I)\psi_\alpha \right\|^2 < \alpha^2 \|\psi_\alpha\|^2,$$

contradicting (9.31). ■
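The phrase "sufficiently large" can be made explicit. The following sketch combines the identity of Lemma 9.42 with the uniform bound $C$ on $m_\alpha$:

```latex
\bigl\|(\hat{H}^{*} - i\alpha I)\psi_\alpha\bigr\|
 = \frac{\hbar^{2}}{2m}\,\|\psi_\alpha m_\alpha\|
 \le \frac{\hbar^{2}C}{2m}\,\|\psi_\alpha\|
 < \alpha\,\|\psi_\alpha\|
 \qquad\text{whenever}\quad \alpha > \max\!\left(1,\ \frac{\hbar^{2}C}{2m}\right).
```

Squaring both sides gives the strict inequality used in the proof, so any such value of the parameter already yields the contradiction.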

See Exercise 22 for a more explicit approach to showing that $\hat{H}^*$ is not
symmetric.


9.11  Exercises 


1. Show that an unbounded operator $A$ fails to be closable if and only
if the closure of the graph of $A$ contains an element of the form $(0, \psi)$
with $\psi \ne 0$.


2. Define an unbounded operator $A$ on $L^2([0, 1])$ with domain $\mathrm{Dom}(A) =
C([0, 1])$ by

$$Af = f(0)\mathbf{1},$$

where $\mathbf{1}$ is the constant function. Show that $A$ is not closable.


3.  Prove  Proposition  9.13. 

4. Suppose that $A$ is an unbounded self-adjoint operator on $\mathbf{H}$ and that
numbers $\lambda_n$ in $\sigma(A)$ converge to some $\lambda \in \mathbb{R}$. Using Point 1 of
Proposition 9.18, show that $\lambda \in \sigma(A)$.

5.  Suppose  A  is  a  closed  operator  on  H.  Show  that  the  kernel  of  A  is  a 
closed  subspace  of  H. 


6. Suppose $A$ is a closed operator on $\mathbf{H}$. Define a norm $\|\cdot\|_A$ on $\mathrm{Dom}(A)$
by

$$\|\psi\|_A = \|\psi\| + \|A\psi\|.$$

Show that $\mathrm{Dom}(A)$ is a Banach space with respect to $\|\cdot\|_A$.


7. Let $A$ be an unbounded operator on $\mathbf{H}$.

(a) Show that if $A$ is symmetric, then $A^{cl}$ is also symmetric.

(b) Show that if $B$ is an extension of $A$, then $A^*$ is an extension of
$B^*$.

(c) Suppose $A$ is self-adjoint and $B$ is an extension of $A$. Show that
if $B$ is symmetric, then $\mathrm{Dom}(A) = \mathrm{Dom}(B)$. (That is to say, a
self-adjoint operator has no proper symmetric extensions.)

8. Fix a positive real number $\mu$.

(a) Show that a symmetric operator $A$ is self-adjoint if and only if
$\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are equal to $\mathbf{H}$.

(b) Show that a symmetric operator $A$ is essentially self-adjoint if
and only if $\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are dense in $\mathbf{H}$.

9. Let $A$ be the operator considered in Sect. 9.6. Using Lemma 9.28,
show that for each $\lambda \in \mathbb{C}$, there exists $\psi \in \mathrm{Dom}(A^*)$ with $A^*\psi = \lambda\psi$.
Conclude that each $\lambda \in \mathbb{C}$ belongs to the spectrum of $A^{cl}$.

Hint: Recall that $(A^{cl})^* = A^*$.

10. Let $A$ be the operator considered in Sect. 9.6 and suppose $\psi$ is in the
domain of $A^{cl}$. Then there exists a sequence $\psi_n$ in $\mathrm{Dom}(A)$ such that
$\psi_n$ converges to $\psi$ in $L^2([0, 1])$ and such that $A\psi_n$ converges to some
$\chi$ in $L^2([0, 1])$.

(a) Show that

$$\psi_n(x) = i\left\langle \mathbf{1}_{[0,x]}, A\psi_n \right\rangle$$

for all $x \in [0, 1]$.

(b) Show that $\psi_n$ converges uniformly to the function

$$\psi(x) = i\left\langle \mathbf{1}_{[0,x]}, \chi \right\rangle.$$

(c) Conclude that $\psi$ is continuous and satisfies $\psi(0) = \psi(1) = 0$.

11. Take $\mathbf{H} = L^2((0, \infty))$ and let $A$ be the operator $-i\,d/dx$, with
$\mathrm{Dom}(A)$ consisting of those smooth functions that are supported on
a compact subset of $(0, \infty)$. (Such a function is, in particular, zero on
$(0, \varepsilon)$ for some $\varepsilon > 0$.) Show that $A$ is symmetric and that $A^* + iI$ is
injective but that $A^* - iI$ is not injective.

Hint: Imitate the arguments in the proof of Propositions 9.27 and 9.29.


12.  Prove  the  second  part  of  Lemma  9.33. 


13. Let $\chi$ be a smooth, radial function on $\mathbb{R}^3$ such that for $|x| < 1$ we
have $\chi(x) = 1$, for $|x| > 2$ we have $\chi(x) = 0$, and for $1 < |x| < 2$, we
have $\partial\chi/\partial r < 0$. Show that

$$\int_{\mathbb{R}^3} \frac{1}{|x|}\,\Delta\chi(x)\,dx < 0,$$

which shows that the Laplacian of $1/|x|$, in the distribution sense, is
not zero.

Hint: Let $E = C_1 \setminus C_2$, where $C_1$ is a cube centered at the origin with
side length 5 and where $C_2$ is a cube centered at the origin with side
length 1/2. Then $E$ contains the support of $\Delta\chi$. Using integration by
parts on $E$, show that

$$\int_E \frac{1}{|x|}\,\Delta\chi(x)\,dx = -\int_E \nabla\!\left( \frac{1}{|x|} \right) \cdot \nabla\chi(x)\,dx.$$


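14. Let $\mathrm{Dom}(\Delta) \subset L^2(\mathbb{R}^n)$ denote the domain of the Laplacian, as given
in Proposition 9.34, and assume $n \le 3$.

(a) Show that each $\psi \in \mathrm{Dom}(\Delta)$ is continuous and that there exist
constants $c_1$ and $c_2$ such that

$$|\psi(x)| \le c_1\,\|\psi\| + c_2\,\bigl\| |k|^2\hat\psi(k) \bigr\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.

Hint: Show that $\hat\psi$ is in $L^1$ by expressing $\hat\psi$ as the product of
two $L^2$ functions.

(b) Show that for any $\varepsilon > 0$, there exists a constant $c_\varepsilon$ such that

$$|\psi(x)| \le c_\varepsilon\,\|\psi\| + \varepsilon\,\|\Delta\psi\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.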


15. Recall the definitions of $\mathrm{Dom}(P_j)$ and $\mathrm{Dom}(\Delta)$ in Sect. 9.8. Let
$\mathrm{Dom}(P_j^2)$ be the set of all $\psi$ belonging to $\mathrm{Dom}(P_j)$ such that $P_j\psi$
again belongs to $\mathrm{Dom}(P_j)$. Show that

$$\bigcap_{j=1}^{n} \mathrm{Dom}(P_j^2) = \mathrm{Dom}(\Delta).$$

16. Let $Q_j$ denote the restriction to $C_c^\infty(\mathbb{R}^n)$ of the momentum operator
$P_j$. Show that $\mathrm{Dom}(Q_j^*) = \mathrm{Dom}(P_j)$. Conclude that $Q_j$ is essentially
self-adjoint.


17. Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$ and let $\mu$ be a
nonzero real number.

(a) Show that $\|(A + i\mu I)^{-1}\| \le 1/|\mu|$. Note that $(A + i\mu I)^{-1}$ exists,
by Theorem 9.17.

(b) Show that for all $\psi \in \mathbf{H}$,

$$\|\psi\|^2 = \|A(A + i\mu I)^{-1}\psi\|^2 + \mu^2\,\|(A + i\mu I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + i\mu I)^{-1}\| \le 1$.

18. Let $A$ be an unbounded self-adjoint operator on $\mathbf{H}$. Suppose $A$ is
non-negative (Definition 9.19) and let $\lambda$ be a positive real number.

(a) Show that $\|(A + \lambda I)^{-1}\| \le 1/\lambda$.

(b) Show that for all $\psi \in \mathbf{H}$,

$$\|\psi\|^2 \ge \|A(A + \lambda I)^{-1}\psi\|^2 + \lambda^2\,\|(A + \lambda I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + \lambda I)^{-1}\| \le 1$.


19. Prove the last part of Theorem 9.37, concerning domains of essential
self-adjointness.

Hint: If $A$ is self-adjoint on $\mathrm{Dom}(A)$ and $V \subset \mathrm{Dom}(A)$ is a dense
subspace of $\mathbf{H}$, then $A$ is essentially self-adjoint on $V$ if and only if
the closure of $A|_V$ is equal to $A$.


20. Let $A$ be the operator $b \cdot X$ on the domain $C_c^\infty(\mathbb{R}^n)$, for some $b \in \mathbb{R}^n$.

(a) Using the definition of the adjoint of an unbounded operator,
show that $\mathrm{Dom}(A^*)$ consists of all those $\psi$ in $L^2(\mathbb{R}^n)$ for which
the function $(b \cdot x)\psi(x)$ again belongs to $L^2(\mathbb{R}^n)$.

(b) Using Proposition 9.30, show that $A$ is essentially self-adjoint.

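21. (a) Show that a function $\phi \in C_c^\infty(\mathbb{R}^n)$ can be expressed as $\phi =
\partial\chi/\partial x_1$ for some $\chi \in C_c^\infty(\mathbb{R}^n)$ if and only if $\phi$ satisfies

$$\int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1 = 0$$

for all $(x_2, \ldots, x_n)$.

(b) Fix a function $\gamma \in C_c^\infty(\mathbb{R})$ such that $\int_{-\infty}^{\infty} \gamma(x)\,dx = 1$. Show
that any $\phi \in C_c^\infty(\mathbb{R}^n)$ can be expressed as

$$\phi(x) = f(x_2, \ldots, x_n)\gamma(x_1) + \frac{\partial\chi}{\partial x_1}$$

for some $\chi \in C_c^\infty(\mathbb{R}^n)$, where $f$ is the element of $C_c^\infty(\mathbb{R}^{n-1})$
given by

$$f(x_2, \ldots, x_n) = \int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1.$$

(c) Suppose $T$ is a distribution on $\mathbb{R}^n$ with the property that
$\partial T/\partial x_1 = 0$. Define a distribution $c$ on $\mathbb{R}^{n-1}$ by the formula

$$c(f) = T\bigl(f(x_2, \ldots, x_n)\gamma(x_1)\bigr).$$

Show that for all $\phi \in C_c^\infty(\mathbb{R}^n)$ we have

$$T(\phi) = c(\tilde\phi),$$

where $\tilde\phi \in C_c^\infty(\mathbb{R}^{n-1})$ is given by

$$\tilde\phi(x_2, \ldots, x_n) = \int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1.$$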


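22. Let $\hat{H}$ denote the Schrödinger operator in Theorem 9.41 and let $\psi_\alpha$
be the function defined in Lemma 9.42.

(a) Show that

$$\left\langle \psi_\alpha, \hat{H}^*\psi_\alpha \right\rangle - \left\langle \hat{H}^*\psi_\alpha, \psi_\alpha \right\rangle = -\frac{\hbar^2}{2m} \lim_{A \to \infty} \left[ \overline{\psi_\alpha(x)}\,\psi_\alpha'(x) - \overline{\psi_\alpha'(x)}\,\psi_\alpha(x) \right]_{-A}^{A}.$$

(b) Now show by direct calculation that $\langle \psi_\alpha, \hat{H}^*\psi_\alpha \rangle \ne \langle \hat{H}^*\psi_\alpha, \psi_\alpha \rangle$.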


10 

The  Spectral  Theorem  for  Unbounded 
Self-Adjoint  Operators 


This chapter gives statements and proofs of the spectral theorem for
unbounded self-adjoint operators, in the same forms as in the bounded
case, in terms of projection-valued measures, in terms of direct integrals,
and in terms of multiplication operators. The proof reduces the spectral
theorem for an unbounded self-adjoint operator $A$ to the spectral theorem for
the bounded operator $U := (A + iI)(A - iI)^{-1}$ (Sect. 10.4). This bounded
operator is, however, not self-adjoint but rather unitary. Thus, before
coming to the proof of the spectral theorem for unbounded self-adjoint
operators, we prove (Sect. 10.3) the spectral theorem for bounded normal
operators, those that commute with their adjoints. (A unitary operator $U$
certainly commutes with its adjoint $U^* = U^{-1}$.) The proof for a bounded
normal operator $B$ is the same as for bounded self-adjoint operators,
except for the step in which we approximate continuous functions on $\sigma(B)$
by polynomials. Since $\sigma(B)$ is not necessarily contained in $\mathbb{R}$, we need to
use the complex version of the Stone-Weierstrass theorem, which requires
us to consider polynomials in $\lambda$ and $\bar\lambda$. We must then prove a strengthened
version of the spectral mapping theorem before proceeding along the lines
of the proof for bounded self-adjoint operators.

In  Sect.  10.2,  we  discuss  Stone’s  theorem,  which  gives  a  one-to-one  corre¬ 
spondence  between  strongly  continuous  one-parameter  unitary  groups  and 
self-adjoint  operators.  One  direction  of  Stone’s  theorem  follows  from  the 
spectral  theorem,  that  is,  from  the  functional  calculus  that  results  from  the 
spectral  theorem. 


B.C. Hall, Quantum Theory for Mathematicians, Graduate Texts

10.1 Statements of the Spectral Theorem

The  statement  of  the  spectral  theorem — in  any  of  the  forms  that  we  have 
considered — is  almost  the  same  for  unbounded  self-adjoint  operators  as  for 
bounded  ones.  The  only  difference  is  that  the  statement  of  the  theorem  in 
the  unbounded  case  has  to  contain  some  description  of  the  domain  of  the 
operator. 

Recall that if $\mu$ is a projection-valued measure on $(X, \Omega)$ with values in $\mathcal{B}(\mathbf{H})$ and $\psi$ is an element of $\mathbf{H}$, then we can construct a non-negative, real-valued measure $\mu_\psi$ from $\mu$ by setting $\mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$, for each measurable set $E$. To motivate the following definition, consider integration of a bounded measurable function $f$ against a projection-valued measure $\mu$. Since the integral is multiplicative and complex-conjugation of a function corresponds to adjoint of the operator, we have

\[ \left\langle \left( \int_X f \, d\mu \right) \psi, \left( \int_X f \, d\mu \right) \psi \right\rangle = \left\langle \psi, \left( \int_X \bar{f} f \, d\mu \right) \psi \right\rangle = \int_X |f|^2 \, d\mu_\psi. \tag{10.1} \]

Suppose, now, that $f$ is an unbounded measurable function on $X$ and we wish to define $\int_X f \, d\mu$, which will presumably be an unbounded operator. It seems reasonable to define the domain of $\int_X f \, d\mu$ to be the set of $\psi$ for which the right-hand side of (10.1) is finite.
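A concrete instance may help. The sketch below assumes the simplest possible setting, which is not taken from the text: $X = \{1, 2, 3, \ldots\}$, $\mathbf{H} = \ell^2(X)$, and $\mu(E)$ the projection onto sequences supported in $E$, so that $\mu_\psi(\{n\}) = |\psi_n|^2$ and the right-hand side of (10.1) becomes $\sum_n |f(n)|^2 |\psi_n|^2$. With the unbounded function $f(n) = n$, the vector $\psi_n = 1/n$ lies in $\mathbf{H}$ but the sum diverges, whereas $\phi_n = 1/n^2$ gives a finite sum. The helper name `weighted_sum` is illustrative.

```python
import math

def weighted_sum(f, psi, N):
    """Partial sum over n = 1..N of |f(n)|^2 |psi(n)|^2, i.e. the integral of
    |f|^2 against mu_psi restricted to {1, ..., N} (assumed setting: l^2)."""
    return sum(abs(f(n)) ** 2 * abs(psi(n)) ** 2 for n in range(1, N + 1))

def one(n):
    return 1.0

def f(n):
    return float(n)        # unbounded function on X = {1, 2, 3, ...}

def psi(n):
    return 1.0 / n         # square-summable, but n * psi(n) = 1 is not

def phi(n):
    return 1.0 / n ** 2    # n * phi(n) = 1/n is square-summable

for N in (10, 100, 1000):
    print(N,
          weighted_sum(one, psi, N),   # stays below pi^2/6: psi is in H
          weighted_sum(f, psi, N),     # grows like N: psi is outside the domain
          weighted_sum(f, phi, N))     # stays below pi^2/6: phi is in the domain
```

The middle column grows without bound, so in this assumed setting $\psi$ is a vector in $\mathbf{H}$ on which the operator of multiplication by $n$ cannot be defined.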

Proposition 10.1 Suppose $\mu$ is a projection-valued measure on $(X, \Omega)$ with values in $\mathcal{B}(\mathbf{H})$ and $f : X \to \mathbb{C}$ is a measurable function (not necessarily bounded). Define a subspace $W_f$ of $\mathbf{H}$ by

\[ W_f = \left\{ \psi \in \mathbf{H} \;\middle|\; \int_X |f(\lambda)|^2 \, d\mu_\psi(\lambda) < \infty \right\}. \tag{10.2} \]

Then there exists a unique unbounded operator on $\mathbf{H}$ with domain $W_f$, which is denoted by $\int_X f \, d\mu$, with the property that

\[ \left\langle \psi, \left( \int_X f \, d\mu \right) \psi \right\rangle = \int_X f(\lambda) \, d\mu_\psi(\lambda) \]

for all $\psi$ in $W_f$. This operator satisfies (10.1) for all $\psi \in W_f$.

Note that since $\mu_\psi$ is a finite measure for all $\psi$, if $f$ is bounded then the domain of $\int_X f \, d\mu$ is all of $\mathbf{H}$. Thus, in the bounded case, the definition of $\int_X f \, d\mu$ in Proposition 10.1 agrees with our earlier definition (in Chap. 7) of the integral. This means, in particular, that if $f$ is a bounded function, $\int_X f \, d\mu$ is a bounded operator. Proposition 10.1 follows immediately from the following result.
Proposition 10.2 Let $f$ be a measurable function on $X$ and let $W_f$ be as in (10.2). Then the following results hold.

1. The space $W_f$ is a dense subspace of $\mathbf{H}$ and the map $Q_f : W_f \to \mathbb{C}$ given by

\[ Q_f(\psi) = \int_X f(\lambda) \, d\mu_\psi(\lambda) \]

is a quadratic form on $W_f$.

2. If $L_f$ is the associated sesquilinear form on $W_f$, we have

\[ |L_f(\phi, \psi)| \le \|\phi\| \, \|f\|_{L^2(X, \mu_\psi)} \tag{10.3} \]

for all $\phi, \psi \in W_f$.

3. For each $\psi \in W_f$, there is a unique $\chi \in \mathbf{H}$ such that $L_f(\phi, \psi) = \langle \phi, \chi \rangle$ for all $\phi \in W_f$. Furthermore, the map $\psi \mapsto \chi$ is linear and for all $\psi \in W_f$, we have

\[ \|\chi\|^2 = \int_X |f|^2 \, d\mu_\psi. \tag{10.4} \]


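Proof. It is easy to see that $W_f$ is closed under scalar multiplication. To show that it is closed under addition, note that since $\mu(E)$ is self-adjoint and satisfies $\mu(E)^2 = \mu(E)$, we have

\[ \begin{aligned} \mu_{\phi+\psi}(E) &= \|\mu(E)(\phi + \psi)\|^2 \\ &\le \left( \|\mu(E)\phi\| + \|\mu(E)\psi\| \right)^2 \\ &\le 2\|\mu(E)\phi\|^2 + 2\|\mu(E)\psi\|^2 \\ &= 2\mu_\phi(E) + 2\mu_\psi(E), \end{aligned} \]

where in the third line we have used the elementary inequality $(x + y)^2 \le 2x^2 + 2y^2$. Consequently, $\int_X |f|^2 \, d\mu_{\phi+\psi}$ is finite whenever the corresponding integrals for $\phi$ and $\psi$ are finite.

To show that $W_f$ is dense in $\mathbf{H}$, let $E_n = \{x \in X \mid |f(x)| < n\}$. If $\psi \in \mathrm{Range}(\mu(E_n))$, then $\mu_\psi(E_n^c) = 0$, and, thus,

\[ \int_X |f|^2 \, d\mu_\psi = \int_{E_n} |f|^2 \, d\mu_\psi \le n^2 \mu_\psi(E_n) < \infty, \tag{10.5} \]

showing that $\psi$ belongs to $W_f$. Since also $\bigcup_n E_n = X$, the union of the ranges of the $\mu(E_n)$'s is dense and contained in $W_f$.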

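If $f$ is bounded, $Q_f$ may be computed as

\[ Q_f(\psi) = \left\langle \psi, \left( \int_X f \, d\mu \right) \psi \right\rangle, \quad \psi \in \mathbf{H}, \]

where $\int_X f \, d\mu$ is as in Chap. 7. Thus, $Q_f$ is a quadratic form for which the associated sesquilinear form is

\[ L_f(\phi, \psi) = \left\langle \phi, \left( \int_X f \, d\mu \right) \psi \right\rangle, \quad \phi, \psi \in \mathbf{H}. \]

This form satisfies

\[ \begin{aligned} |L_f(\phi, \psi)| &\le \|\phi\| \left\| \left( \int_X f \, d\mu \right) \psi \right\| \\ &= \|\phi\| \, \|f\|_{L^2(X, \mu_\psi)} \end{aligned} \tag{10.6} \]

for all $\phi, \psi \in \mathbf{H}$, where in the second line we have used (10.1).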

If $f$ is unbounded and $\psi$ belongs to $W_f$, let $f_n = f 1_{E_n}$. Then $Q_f(\psi) = \lim_{n \to \infty} Q_{f_n}(\psi)$, by monotone convergence, in which case, it is easy to see that $Q_f$ is still a quadratic form and that (10.6) still holds for all $\phi, \psi \in W_f$. From (10.6), we see that for each $\psi \in W_f$, the conjugate-linear functional $\phi \mapsto L_f(\phi, \psi)$ is bounded. Thus, by (the complex-conjugate of) the Riesz theorem, there is a unique vector $\chi$ such that $L_f(\phi, \psi) = \langle \phi, \chi \rangle$. Furthermore, (10.6) tells us that $\|\chi\| \le \|f\|_{L^2(X, \mu_\psi)}$. Conversely, since $L_f(\phi, \psi) = \langle \phi, \chi \rangle$, (10.6) is an equality when $\phi = \chi$, showing that $\|\chi\| \ge \|f\|_{L^2(X, \mu_\psi)}$. Finally, the map $\psi \mapsto \chi$ is linear because $L_f(\phi, \psi)$ is linear in $\psi$. ■


Proposition 10.3 If $f$ is a real-valued, measurable function on $X$, then $\int_X f \, d\mu$ is self-adjoint on $W_f$.

Proof. Let $A_f = \int_X f \, d\mu$. Define subsets $F_n$ of $X$ by

\[ F_n = \{x \in X \mid n - 1 \le |f(x)| < n\}, \]

so that $X$ is the disjoint union of the $F_n$'s, and let $W_n = \mathrm{Range}(\mu(F_n))$. As in the proof of Proposition 10.2, any $\psi \in W_n$ is in $W_f$, and the quadratic form $Q_f$ is bounded on $W_n$ [compare (10.5)]. Furthermore, if $\phi \in (W_n)^\perp$ and $\psi \in W_n$, it is straightforward to check that $\mu_{\phi+\psi} = \mu_\phi + \mu_\psi$ and so

\[ Q_f(\phi + \psi) = Q_f(\phi) + Q_f(\psi). \tag{10.7} \]

From (10.7), we obtain, by the polarization identity,

\[ \langle \phi, A_f \psi \rangle = L_f(\phi, \psi) = 0. \]

This shows that $A_f \psi$ belongs to $(W_n^\perp)^\perp = W_n$.

We conclude that $A_f$ maps $W_n$ boundedly to itself. Indeed, the restriction to $W_n$ of $A_f$ coincides with the restriction to $W_n$ of the bounded operator obtained by integrating $f 1_{F_n}$ with respect to $\mu$ (compare the quadratic forms). Furthermore, since $Q_f$ is real-valued, the restriction of $A_f$ to $W_n$ is self-adjoint (Proposition A.63).
Now, $\mathbf{H}$ is the orthogonal direct sum of the $W_n$'s, meaning that $\mathbf{H}$ may be identified with the set of infinite sequences $(\psi_1, \psi_2, \psi_3, \ldots)$ with $\psi_n \in W_n$ and such that

\[ \sum_{n=1}^{\infty} \|\psi_n\|^2 < \infty. \]

If $A_n$ denotes the restriction of $A_f$ to $W_n$, then under this decomposition of $\mathbf{H}$, we have

\[ W_f = \left\{ \psi = (\psi_1, \psi_2, \ldots) \in \mathbf{H} \;\middle|\; \sum_{n=1}^{\infty} \|A_n \psi_n\|^2 < \infty \right\}. \tag{10.8} \]

To verify (10.8), we note that

\[ \int_X |f|^2 \, d\mu_\psi = \sum_{n=1}^{\infty} \int_{F_n} |f|^2 \, d\mu_\psi = \sum_{n=1}^{\infty} \|A_n \psi_n\|^2. \tag{10.9} \]

The first equality is by monotone convergence and the second holds because $\mu_\psi = \mu_{\psi_n}$ on $F_n$. In particular, the first quantity in (10.9) is finite if and only if the last quantity is finite.

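By a similar argument, for $\psi \in W_f$, we have

\[ Q_f(\psi) = \int_X f(\lambda) \, d\mu_\psi(\lambda) = \sum_{n=1}^{\infty} \langle \psi_n, A_n \psi_n \rangle, \]

from which it follows that

\[ L_f(\phi, \psi) = \sum_{n=1}^{\infty} \langle \phi_n, A_n \psi_n \rangle \]

for all $\phi, \psi \in W_f$. From this we see that $A_f \psi$ is the vector represented by the sequence $(A_1 \psi_1, A_2 \psi_2, \ldots)$. It then follows from Example 9.26 that $A_f$ is self-adjoint. ■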


Theorem 10.4 (Spectral Theorem, First Form) Suppose $A$ is a self-adjoint operator on $\mathbf{H}$. Then there is a unique projection-valued measure $\mu^A$ on $\sigma(A)$ with values in $\mathcal{B}(\mathbf{H})$ such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \tag{10.10} \]

Since the spectrum of $A$ is typically an unbounded set, the function $f(\lambda) = \lambda$ is an unbounded function on $\sigma(A)$. Note also that the equality in (10.10) includes, as always, equality of domains. That is, the domain of the integral on the left-hand side, namely the space $W_f$ in Proposition 10.1, coincides with $\mathrm{Dom}(A)$. The proof of this theorem is given in Sect. 10.4.
Definition 10.5 (Functional Calculus) For any measurable function $f$ on $\sigma(A)$, define a (possibly unbounded) operator, denoted $f(A)$, by

\[ f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda). \]

As usual, we can extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to $\mathbb{R}$ by setting $\mu^A$ equal to zero on the complement of $\sigma(A)$.

Definition 10.6 (Spectral Subspaces) If $A$ is a self-adjoint operator on $\mathbf{H}$, then for any Borel set $E \subset \mathbb{R}$, define the spectral subspace $V_E$ of $\mathbf{H}$ by

\[ V_E = \mathrm{Range}(\mu^A(E)). \]

Definition 10.7 (Measurement Probabilities) If $A$ is a self-adjoint operator on $\mathbf{H}$, then for any unit vector $\psi \in \mathbf{H}$, define a probability measure $\mu^A_\psi$ on $\mathbb{R}$ by the formula

\[ \mu^A_\psi(E) = \langle \psi, \mu^A(E) \psi \rangle. \]

If the operator $A$ represents some observable in quantum mechanics, then we interpret $\mu^A_\psi$ to be the probability distribution for the result of measuring $A$ in the state $\psi$.
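In finite dimensions these definitions reduce to familiar linear algebra. The following sketch assumes a $2 \times 2$ example that does not come from the text: $A$ is the matrix with rows $(0, 1)$ and $(1, 0)$, whose spectrum is $\{+1, -1\}$ and whose spectral projections are $(I \pm A)/2$. The names `mu_psi` and `apply_f` are illustrative. The code computes the measurement probabilities for $\psi = (1, 0)$ and checks the identity $\|f(A)\psi\|^2 = \int |f|^2 \, d\mu^A_\psi$ from (10.4).

```python
# Assumed example: A = [[0, 1], [1, 0]] with spectrum {+1, -1}.
A = [[0.0, 1.0], [1.0, 0.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

def spectral_projection(sign):
    """mu^A({sign}) = (I + sign * A) / 2 for the eigenvalue sign = +1 or -1."""
    return [[(I2[i][j] + sign * A[i][j]) / 2 for j in range(2)] for i in range(2)]

PROJ = {1.0: spectral_projection(1.0), -1.0: spectral_projection(-1.0)}

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def inner(u, v):
    """Inner product, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def mu_psi(psi, E):
    """mu^A_psi(E) = <psi, mu^A(E) psi> for a set E of real numbers."""
    return sum(inner(psi, matvec(P, psi)).real
               for lam, P in PROJ.items() if lam in E)

def apply_f(f, psi):
    """f(A) psi = sum over the spectrum of f(lam) * mu^A({lam}) psi."""
    out = [0j, 0j]
    for lam, P in PROJ.items():
        out = [o + f(lam) * c for o, c in zip(out, matvec(P, psi))]
    return out

def f(lam):
    return lam ** 2 + 1j * lam   # an arbitrary complex-valued test function

psi = [1.0, 0.0]                                   # unit vector
probs = {lam: mu_psi(psi, {lam}) for lam in PROJ}  # measurement probabilities
lhs = inner(apply_f(f, psi), apply_f(f, psi)).real                 # ||f(A) psi||^2
rhs = sum(abs(f(lam)) ** 2 * mu_psi(psi, {lam}) for lam in PROJ)   # integral of |f|^2
print(probs, lhs, rhs)
```

In this assumed example each outcome $\pm 1$ has probability $1/2$, and the two sides of the norm identity agree up to rounding.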

Proposition 10.8 Let $A$ be a self-adjoint operator on $\mathbf{H}$. Then the spectral subspaces $V_E$ associated to $A$ have the following properties.

1. If $E$ is a bounded subset of $\mathbb{R}$, then $V_E \subset \mathrm{Dom}(A)$, $V_E$ is invariant under $A$, and the restriction of $A$ to $V_E$ is bounded.

2. If $E$ is contained in $(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, then for all $\psi \in V_E$, we have

\[ \|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|. \]

Proof. Point 1 holds because the function $f(\lambda) = \lambda$ is bounded on $E$. (See the proof of Proposition 10.3.) Point 2 then holds because, as in the proof of Proposition 10.3, the restriction of $A$ to $V_E$ coincides with the restriction to $V_E$ of the operator $f(A)$, where $f(\lambda) = \lambda 1_E(\lambda)$. ■

Theorem 10.9 (Spectral Theorem, Second Form) Suppose $A$ is a self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure $\mu$ on $\sigma(A)$, a direct integral

\[ \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

and a unitary map $U$ from $\mathbf{H}$ to the direct integral such that

\[ U(\mathrm{Dom}(A)) = \left\{ s \in \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda) \;\middle|\; \int_{\sigma(A)} \|\lambda s(\lambda)\|_\lambda^2 \, d\mu(\lambda) < \infty \right\} \]

and such that

\[ (U A U^{-1}(s))(\lambda) = \lambda s(\lambda) \]

for all $s \in U(\mathrm{Dom}(A))$.

Theorem 10.10 (Spectral Theorem, Multiplication Operator Form) Suppose $A$ is a self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure space $(X, \mu)$, a measurable, real-valued function $h$ on $X$, and a unitary map $U : \mathbf{H} \to L^2(X, \mu)$ such that

\[ U(\mathrm{Dom}(A)) = \left\{ \psi \in L^2(X, \mu) \;\middle|\; h\psi \in L^2(X, \mu) \right\} \]

and such that

\[ (U A U^{-1}(\psi))(x) = h(x)\psi(x) \]

for all $\psi \in U(\mathrm{Dom}(A))$.

These  theorems  are  also  proved  in  Sect.  10.4. 


10.2  Stone’s  Theorem  and  One-Parameter  Unitary 
Groups 

In  this  section  we  explore  the  notion  of  one-parameter  unitary  groups  and 
their  connection  to  self-adjoint  operators.  We  assume  here  the  spectral 
theorem,  the  proof  of  which  (in  Sect.  10.4)  does  not  use  any  results  from 
this  section. 


Definition 10.11 A one-parameter unitary group on $\mathbf{H}$ is a family $U(t)$, $t \in \mathbb{R}$, of unitary operators with the property that $U(0) = I$ and that $U(s + t) = U(s)U(t)$ for all $s, t \in \mathbb{R}$. A one-parameter unitary group is said to be strongly continuous if

\[ \lim_{s \to t} \|U(t)\psi - U(s)\psi\| = 0 \tag{10.11} \]

for all $\psi \in \mathbf{H}$ and all $t \in \mathbb{R}$.

Almost  all  one-parameter  unitary  groups  arising  in  applications  are 
strongly  continuous. 

Example 10.12 Let $\mathbf{H} = L^2(\mathbb{R}^n)$ and let $U_a(t)$ be the translation operator given by

\[ (U_a(t)\psi)(x) = \psi(x + ta). \tag{10.12} \]

Then $U_a(\cdot)$ is a strongly continuous one-parameter unitary group.
Proof. It is easy to see that $U_a(\cdot)$ is a one-parameter unitary group. To see that $U_a(\cdot)$ is strongly continuous, consider first the case in which $\psi$ is continuous and compactly supported. Since a continuous function on a compact metric space is automatically uniformly continuous, it follows that $\psi(x + ta)$ tends uniformly to $\psi(x)$ as $t$ tends to zero. Since also the support of $\psi$ is compact and thus of finite measure, it follows that $\psi(x + ta)$ tends to $\psi(x)$ in $L^2(\mathbb{R}^n)$ as $t$ tends to zero.

Now, the space $C_c(\mathbb{R}^n)$ of continuous functions of compact support is dense in $L^2(\mathbb{R}^n)$ (Theorem A.10). Thus, given $\varepsilon > 0$ and $\psi \in L^2(\mathbb{R}^n)$, we can find $\phi \in C_c(\mathbb{R}^n)$ such that $\|\psi - \phi\|_{L^2(\mathbb{R}^n)} < \varepsilon/3$. Then choose $\delta$ so that $\|U_a(\alpha)\phi - \phi\| < \varepsilon/3$ whenever $|\alpha| < \delta$. Then given $t \in \mathbb{R}$, if $|t - s| < \delta$, we have

\[ \begin{aligned} \|U_a(t)\psi - U_a(s)\psi\| &\le \|U_a(t)\psi - U_a(t)\phi\| + \|U_a(t)\phi - U_a(s)\phi\| + \|U_a(s)\phi - U_a(s)\psi\| \\ &= \|U_a(t)(\psi - \phi)\| + \|U_a(s)(U_a(t - s)\phi - \phi)\| + \|U_a(s)(\phi - \psi)\|. \end{aligned} \tag{10.13} \]

Since $U_a(t)$ and $U_a(s)$ are unitary, we can see that each of the terms on the last line of (10.13) is less than $\varepsilon/3$. ■

Note that for $a \neq 0$ the unitary group $U_a(\cdot)$ in Example 10.12 is not continuous in the operator norm topology. After all, given any $\varepsilon \neq 0$, we can take a nonzero element $\psi$ of $L^2(\mathbb{R}^n)$ that is supported in a very small ball around the origin. Then $U_a(\varepsilon)\psi$ is orthogonal to $\psi$ and has the same norm as $\psi$, so that

\[ \|U_a(\varepsilon)\psi - U_a(0)\psi\| = \|U_a(\varepsilon)\psi - \psi\| = \sqrt{2}\,\|\psi\|. \]

Thus, $\|U_a(\varepsilon) - U_a(0)\| \ge \sqrt{2}$ for all $\varepsilon \neq 0$.
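The contrast between strong continuity and norm continuity can be seen in a crude discretization. The sketch below is only an analogy under stated assumptions: the real line is replaced by a periodic grid of `N` points, translation by a cyclic shift of the samples, and the small-ball function by the indicator of a few grid points. None of these names come from the text. For every shift size `k`, a bump narrower than the shift is moved completely off itself and the relative change is $\sqrt{2}$, while for one fixed wide bump a small shift changes very little.

```python
import math

N = 1000  # number of points on a periodic grid (an assumption of this sketch)

def shift(psi, k):
    """Discrete analogue of (U_a(t) psi)(x) = psi(x + ta): move samples by k points."""
    return [psi[(j + k) % N] for j in range(N)]

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def bump(width):
    """Indicator of the first `width` grid points."""
    return [1.0 if j < width else 0.0 for j in range(N)]

def relative_change(psi, k):
    """||U psi - psi|| / ||psi|| for the shift by k grid points."""
    moved = shift(psi, k)
    return norm([m - p for m, p in zip(moved, psi)]) / norm(psi)

# A bump whose support has exactly k points is orthogonal to its shift by k.
for k in (1, 5, 50):
    print(k, relative_change(bump(k), k))   # sqrt(2) each time

# For one fixed wide bump, a small shift produces only a small relative change.
print(relative_change(bump(500), 1))
```

The first loop mirrors the argument above: however small the shift, some vector is moved a distance $\sqrt{2}\,\|\psi\|$, so the operators themselves do not converge in norm.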


Definition 10.13 If $U(\cdot)$ is a strongly continuous one-parameter unitary group, the infinitesimal generator of $U(\cdot)$ is the operator $A$ given by

\[ A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t} \tag{10.14} \]

with $\mathrm{Dom}(A)$ consisting of the set of $\psi \in \mathbf{H}$ for which the limit in (10.14) exists in the norm topology on $\mathbf{H}$.

The following result shows that we can construct a strongly continuous one-parameter unitary group from any self-adjoint operator $A$ by setting $U(t) = e^{iAt}$. Furthermore, the original operator $A$ is precisely the infinitesimal generator of $U(t)$.

Proposition 10.14 Suppose $A$ is a self-adjoint operator on $\mathbf{H}$ and let $U(\cdot)$ be defined by

\[ U(t) = e^{itA}, \]

where the operator $e^{itA}$ is defined by the functional calculus for $A$. Then the following hold.

1. $U(\cdot)$ is a strongly continuous one-parameter unitary group.

2. For all $\psi \in \mathrm{Dom}(A)$, we have

\[ A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t}, \]

where the limit is in the norm topology on $\mathbf{H}$.

3. For all $\psi \in \mathbf{H}$, if the limit

\[ \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t} \]

exists in the norm topology on $\mathbf{H}$, then $\psi \in \mathrm{Dom}(A)$ and the limit is equal to $A\psi$.

Proof. Since $\sigma(A) \subset \mathbb{R}$, the function $f(\lambda) := e^{it\lambda}$ is bounded on $\sigma(A)$ and satisfies $f(\lambda)\overline{f(\lambda)} = 1$ for all $\lambda \in \sigma(A)$. Thus, the operator $f(A)$ is bounded and satisfies

\[ f(A)f(A)^* = f(A)^*f(A) = I, \]

which shows that $f(A) = e^{itA}$ is unitary. The multiplicativity of the functional calculus then tells us that $U(\cdot)$ is a one-parameter unitary group. To see that $U(t)$ is strongly continuous, note that

\[ \begin{aligned} \|U(t)\psi - U(s)\psi\|^2 &= \langle \psi, (U(t)^* - U(s)^*)(U(t) - U(s))\psi \rangle \\ &= \int_{-\infty}^{\infty} \left| e^{it\lambda} - e^{is\lambda} \right|^2 d\mu^A_\psi(\lambda). \end{aligned} \tag{10.15} \]

The integral on the right-hand side of (10.15) tends to zero as $s$ approaches $t$, by dominated convergence.

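For Point 2, recall from Theorem 10.4 that $A = \int_{-\infty}^{\infty} \lambda \, d\mu^A(\lambda)$, and take $\psi \in \mathrm{Dom}(A)$. Then, by (10.4), we have

\[ \left\| \frac{1}{i} \frac{U(t)\psi - \psi}{t} - A\psi \right\|^2 = \int_{-\infty}^{\infty} \left| \frac{1}{i} \frac{e^{it\lambda} - 1}{t} - \lambda \right|^2 d\mu^A_\psi(\lambda). \tag{10.16} \]

If we write the function $e^{it\lambda} - 1$ as the integral of its derivative with respect to $\lambda$, starting at $\lambda = 0$, we can see that $|(e^{it\lambda} - 1)/t| \le |\lambda|$. Meanwhile, since $\psi$ is in the domain of the operator $A = \int_{-\infty}^{\infty} \lambda \, d\mu^A(\lambda)$, we have $\int_{-\infty}^{\infty} \lambda^2 \, d\mu^A_\psi(\lambda) < \infty$. Thus, we may apply dominated convergence, with $4\lambda^2$ as our dominating function, to show that the right-hand side of (10.16) tends to zero as $t$ tends to zero.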
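For Point 3, let $B$ be the infinitesimal generator of $U(\cdot)$. If $\phi$ and $\psi$ belong to $\mathrm{Dom}(B)$, then

\[ \begin{aligned} \langle \phi, B\psi \rangle &= \lim_{t \to 0} \left\langle \phi, \frac{1}{i} \frac{U(t)\psi - \psi}{t} \right\rangle \\ &= \lim_{t \to 0} \left\langle -\frac{1}{i} \frac{U(t)^*\phi - \phi}{t}, \psi \right\rangle \\ &= \lim_{t \to 0} \left\langle \frac{1}{i} \frac{U(-t)\phi - \phi}{(-t)}, \psi \right\rangle = \langle B\phi, \psi \rangle. \end{aligned} \]

Thus, $B$ is symmetric. On the other hand, Point 2 shows that $B$ is an extension of $A$, so by Exercise 7 in Chap. 9, $B = A$ (with equality of domain). ■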
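For a matrix, every claim in Proposition 10.14 can be checked directly. The sketch below again assumes the $2 \times 2$ example $A$ with rows $(0, 1)$ and $(1, 0)$, which is not from the text; its functional calculus gives $e^{itA} = e^{it} P_+ + e^{-it} P_-$ with $P_\pm = (I \pm A)/2$. The code measures the defect in unitarity, the defect in the group law, and the distance between the difference quotient in (10.14) and $A\psi$. All function names are illustrative.

```python
import cmath

# Assumed example: A = (+1) * P_PLUS + (-1) * P_MINUS.
A = [[0.0, 1.0], [1.0, 0.0]]
P_PLUS = [[0.5, 0.5], [0.5, 0.5]]
P_MINUS = [[0.5, -0.5], [-0.5, 0.5]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def U(t):
    """e^{itA} from the functional calculus: e^{it} P_PLUS + e^{-it} P_MINUS."""
    a, b = cmath.exp(1j * t), cmath.exp(-1j * t)
    return [[a * P_PLUS[i][j] + b * P_MINUS[i][j] for j in range(2)]
            for i in range(2)]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def max_diff(M, N):
    return max(abs(M[i][j] - N[i][j]) for i in range(2) for j in range(2))

def difference_quotient(psi, t):
    """(1/i) (U(t) psi - psi) / t, whose limit as t -> 0 defines the generator."""
    return [(u - p) / (1j * t) for u, p in zip(matvec(U(t), psi), psi)]

def generator_error(psi, t):
    """Largest component of the difference quotient minus A psi."""
    target = matvec(A, psi)
    return max(abs(q - a) for q, a in zip(difference_quotient(psi, t), target))

psi = [1.0, 0.0]
print(max_diff(matmul(U(0.7), adjoint(U(0.7))), IDENTITY))   # unitarity defect
print(max_diff(U(0.3 + 0.4), matmul(U(0.3), U(0.4))))        # group-law defect
for t in (1e-1, 1e-2, 1e-3):
    print(t, generator_error(psi, t))                         # shrinks with t
```

In finite dimensions the domain questions that make Points 2 and 3 delicate disappear, so this only illustrates the formulas, not the role of $\mathrm{Dom}(A)$.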


Theorem 10.15 (Stone's Theorem) Suppose $U(\cdot)$ is a strongly continuous one-parameter unitary group on $\mathbf{H}$. Then the infinitesimal generator $A$ of $U(\cdot)$ is densely defined and self-adjoint, and $U(t) = e^{itA}$ for all $t \in \mathbb{R}$.

If $U(\cdot)$ is a strongly continuous one-parameter unitary group, then $U(\cdot)$ is continuous in the operator norm topology if and only if the infinitesimal generator of $U(\cdot)$ is a bounded operator (Exercise 1). As Example 10.12 suggests, most one-parameter unitary groups that arise in applications are not continuous in the operator norm topology.

Before  giving  the  proof  of  Stone’s  theorem,  let  us  work  out  the  generator 
of  the  group  in  Example  10.12. 


Example 10.16 If $U_a(\cdot)$, $a \in \mathbb{R}^n$, is the strongly continuous one-parameter unitary group in Example 10.12, then each $\psi \in C_c^\infty(\mathbb{R}^n)$ is in the domain of the infinitesimal generator $A$ of $U_a(\cdot)$ and for all such $\psi$, we have

\[ A\psi = -i \sum_{j=1}^{n} a_j \frac{\partial \psi}{\partial x_j}. \]

Furthermore, $A$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$.

Proof. The formula for the infinitesimal generator is easy to establish for $\psi$ in $C_c^\infty(\mathbb{R}^n)$. The essential self-adjointness of $A$ is a special case of Proposition 13.5 (the proof of which is similar to the proof of Proposition 9.29).


We  now  establish  two  intermediate  results  before  coming  to  the  proof  of 
Stone’s  theorem. 


Lemma 10.17 Let $U(\cdot)$ be a strongly continuous one-parameter unitary group and let $A$ be its infinitesimal generator. If $\psi \in \mathrm{Dom}(A)$, then for all $t \in \mathbb{R}$, the vector $U(t)\psi$ belongs to $\mathrm{Dom}(A)$ and

\[ \lim_{h \to 0} \frac{U(t + h)\psi - U(t)\psi}{h} = iU(t)A\psi = iAU(t)\psi. \tag{10.18} \]
Note that Lemma 10.17 tells us that the curve $\psi(t) := U(t)\psi_0$ in $\mathbf{H}$ satisfies the differential equation

\[ \frac{d\psi}{dt} = iA\psi(t) \]

in the natural Hilbert space sense, provided that $\psi_0$ belongs to $\mathrm{Dom}(A)$. This result, together with Proposition 10.14, tells us that if $\psi_0 \in \mathrm{Dom}(\hat{H})$, then the curve $\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ indeed solves the Schrödinger equation in the Hilbert space sense.

Proof. We compute that

\[ \frac{U(t + h)\psi - U(t)\psi}{h} = U(t) \left[ \frac{U(h)\psi - \psi}{h} \right]. \tag{10.19} \]

Since $\psi \in \mathrm{Dom}(A)$, the limit as $h$ tends to zero of (10.19) exists and is equal to $iU(t)A\psi$. On the other hand,

\[ \frac{U(t + h)\psi - U(t)\psi}{h} = \frac{U(h)(U(t)\psi) - (U(t)\psi)}{h}. \]

Thus, the limit as $h$ tends to zero of (10.19) is, by the definition of $A$, equal to $iA(U(t)\psi)$. This shows that $U(t)\psi$ is in the domain of $A$ and establishes the second equality in (10.18). ■

Lemma 10.18 For any strongly continuous one-parameter unitary group $U(\cdot)$, the infinitesimal generator $A$ is densely defined.

Proof. Given any continuous function $f$ of compact support, define an operator $B_f$ by setting

\[ B_f = \int_{-\infty}^{\infty} f(\tau) U(\tau) \, d\tau. \]

Here, the operator-valued integral is the unique bounded operator such that

\[ \langle \phi, B_f \psi \rangle = \int_{-\infty}^{\infty} f(\tau) \langle \phi, U(\tau)\psi \rangle \, d\tau. \tag{10.20} \]

[It is easy to see that the right-hand side of (10.20) defines a bounded sesquilinear form, for each fixed $f \in C_c^\infty(\mathbb{R})$.]

Using the group property of $U(\cdot)$, we see that

\[ \begin{aligned} U(t)B_f\psi - B_f\psi &= \int_{-\infty}^{\infty} \left[ f(\tau)U(\tau + t)\psi - f(\tau)U(\tau)\psi \right] d\tau \\ &= \int_{-\infty}^{\infty} \left[ f(\tau - t) - f(\tau) \right] U(\tau)\psi \, d\tau, \end{aligned} \]

where in the second line, we have made a change of variable in the first term in the integral. From this, we easily obtain that

\[ \lim_{t \to 0} \frac{U(t)B_f\psi - B_f\psi}{t} = -\int_{-\infty}^{\infty} f'(\tau)U(\tau)\psi \, d\tau. \]

This shows that $B_f\psi$ is in the domain of $A$ for all $\psi \in \mathbf{H}$ and $f \in C_c^\infty(\mathbb{R})$.

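Now choose a sequence $f_n \in C_c^\infty(\mathbb{R})$ such that $f_n$ is non-negative and supported in the interval $[-1/n, 1/n]$ and such that $\int_{-\infty}^{\infty} f_n(\tau) \, d\tau = 1$. Then for any $\psi \in \mathbf{H}$, we have

\[ B_{f_n}\psi - \psi = \int_{-\infty}^{\infty} f_n(\tau) \left[ U(\tau)\psi - \psi \right] d\tau, \]

so that

\[ \|B_{f_n}\psi - \psi\| \le \int_{-\infty}^{\infty} f_n(\tau) \|U(\tau)\psi - \psi\| \, d\tau \le \sup_{-1/n \le \tau \le 1/n} \|U(\tau)\psi - \psi\|. \]

Since $U(\cdot)$ is strongly continuous, we see that $B_{f_n}\psi$ converges to $\psi$ as $n \to \infty$. Thus, every element of $\mathbf{H}$ can be approximated by vectors in the domain of $A$. ■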

Proof of Theorem 10.15. Suppose $U(\cdot)$ is a strongly continuous one-parameter unitary group and $A$ is its infinitesimal generator. By Lemma 10.18, $A$ is densely defined. As shown in the proof of Proposition 10.14, $A$ (denoted by $B$ in that proof) is symmetric.

Next, we show that $A$ is essentially self-adjoint. Suppose now that $\psi$ belongs to the kernel of $A^* - iI$, i.e., $A^*\psi = i\psi$. Given $\phi \in \mathrm{Dom}(A)$, set $y(t) = \langle U(t)\phi, \psi \rangle$, so that $|y(t)| \le \|\phi\| \, \|\psi\|$. On the other hand, we expect that $U(t) = e^{iAt}$, so that $U(t)^*$ should be $e^{-iA^*t}$. Thus, $y(t)$ should (formally) be equal to $\langle \phi, e^{t}\psi \rangle$. If this is correct, then since $y(t)$ is a bounded function of $t$, we must have $\langle \phi, \psi \rangle = 0$. Thus, $\psi$ would be orthogonal to every element of a dense subspace of $\mathbf{H}$, showing that $\psi = 0$. We could then similarly argue that $\ker(A^* + iI) = \{0\}$, which would show that $A$ is essentially self-adjoint.

To make the argument rigorous, we apply Lemma 10.17, giving

\[ \begin{aligned} \frac{d}{dt} \langle U(t)\phi, \psi \rangle &= \langle iAU(t)\phi, \psi \rangle = \langle iU(t)\phi, A^*\psi \rangle \\ &= \langle iU(t)\phi, i\psi \rangle = \langle U(t)\phi, \psi \rangle. \end{aligned} \]

Thus, the function $y(t) := \langle U(t)\phi, \psi \rangle$ satisfies the ordinary differential equation $dy/dt = y$. The unique solution to this equation is $y(t) = y(0)e^{t}$. Since $y$ is bounded, we must have $0 = y(0) = \langle \phi, \psi \rangle$ for all $\phi \in \mathrm{Dom}(A)$, which implies that $\psi = 0$. Thus, $\ker(A^* - iI) = \{0\}$, and by a similar argument $\ker(A^* + iI) = \{0\}$. This shows (Corollary 9.22) that $A$ is essentially self-adjoint.

We can now construct a strongly continuous unitary group $V(\cdot)$ by setting $V(t) = e^{iA^{cl}t}$. To show that $V(\cdot) = U(\cdot)$, take $\psi \in \mathrm{Dom}(A) \subset \mathrm{Dom}(A^{cl})$ and set $w(t) = U(t)\psi - V(t)\psi$. By Proposition 10.14, the infinitesimal generator of $V(\cdot)$ is $A^{cl}$. Thus, applying Lemma 10.17 to both $U(\cdot)$ and $V(\cdot)$, we have

\[ \frac{d}{dt} w(t) = iAU(t)\psi - iA^{cl}V(t)\psi = iA^{cl}w(t), \]

where the limit defining $dw/dt$ is taken in the norm topology on $\mathbf{H}$. Thus,

\[ \begin{aligned} \frac{d}{dt} \|w(t)\|^2 &= \langle iA^{cl}w(t), w(t) \rangle + \langle w(t), iA^{cl}w(t) \rangle \\ &= -i \langle A^{cl}w(t), w(t) \rangle + i \langle w(t), A^{cl}w(t) \rangle = 0, \end{aligned} \]

because $A^{cl}$ is symmetric. Since also $w(0) = 0$, we conclude that $w(t) = 0$ for all $t$. Thus, $U(\cdot)$ and $V(\cdot)$ agree on a dense subspace and hence on all of $\mathbf{H}$.

We now know that $U(t) = e^{iA^{cl}t}$. It then follows from Points 2 and 3 of Proposition 10.14 that the infinitesimal generator of $U(\cdot)$ (namely $A$) is precisely $A^{cl}$. That is, $A = A^{cl}$ and $U(t) = e^{iAt}$. Furthermore, we have already shown that $A$ is essentially self-adjoint and we now know that $A = A^{cl}$, so $A$ is actually self-adjoint. Finally, if $B$ is any self-adjoint operator for which $U(t) = e^{iBt}$, then by Proposition 10.14, $B$ must be the infinitesimal generator of $U(\cdot)$, i.e., $B = A$. ■


10.3  The  Spectral  Theorem  for  Bounded  Normal 
Operators 

We  are  going  to  prove  the  spectral  theorem  for  an  unbounded  self-adjoint 
operator  by  reducing  it  to  the  spectral  theorem  for  a  bounded  operator. 
The  reduction,  however,  will  not  be  to  a  bounded  self-adjoint  operator,  but 
rather  to  a  unitary  operator.  Although  we  proved  the  spectral  theorem  only 
for  bounded  self-adjoint  operators,  the  theorem  applies  more  generally  to 
bounded  normal  operators.  (See  Exercise  4  in  Chap.  7  for  the  matrix  case.) 

Definition 10.19 A bounded operator $A$ on $\mathbf{H}$ is normal if $A$ commutes with its adjoint: $AA^* = A^*A$.

Every bounded self-adjoint operator is obviously normal. Other examples of normal operators are skew-self-adjoint operators ($A^* = -A$) and unitary operators ($UU^* = U^*U = I$). The spectrum of a bounded normal operator need not be contained in $\mathbb{R}$, but can be an arbitrary closed, bounded, nonempty subset of $\mathbb{C}$. On the other hand, if $U$ is unitary, then the spectrum of $U$ is contained in the unit circle (Exercise 6 in Chap. 7).

In this section, we consider the spectral theorem for a bounded normal operator $A$. The statements of the two versions of the theorem are precisely the same as in the self-adjoint case, except that $\sigma(A)$ is no longer necessarily contained in the real line. Almost all of the proofs of these results are the same as in the self-adjoint case; we will, therefore, consider only those steps where some modification in the argument is required.


Theorem 10.20 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then there exists a unique projection-valued measure $\mu^A$ on the Borel $\sigma$-algebra in $\sigma(A)$, with values in $\mathcal{B}(\mathbf{H})$, such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \]

Furthermore, for any measurable set $E \subset \sigma(A)$, $\mathrm{Range}(\mu^A(E))$ is invariant under $A$ and $A^*$.


Once we have the projection-valued measure $\mu^A$, we can define a functional calculus for $A$, as in the self-adjoint case, by setting

\[ f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda) \]

for any bounded measurable function $f$ on $\sigma(A)$.

We can also define spectral subspaces, as in the self-adjoint case, by setting

\[ V_E := \mathrm{Range}(\mu^A(E)) \]

for each Borel set $E \subset \sigma(A)$. These spectral subspaces have precisely the same properties (with the same proofs) as in Proposition 7.15, with the following two exceptions. First, the assertion that $V_E$ is invariant under $A$ should be replaced by the assertion that $V_E$ is invariant under $A$ and $A^*$. Second, in Point 2 of the proposition, the condition $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ should be replaced by $E \subset D(\lambda_0, \varepsilon)$, where $D(z, r)$ denotes the disk of radius $r$ in $\mathbb{C}$ centered at $z$.

Meanwhile, the spectral theorem in its direct integral and multiplication operator versions also holds for a bounded normal operator $A$. The statements are identical to the self-adjoint case, except that we no longer assume $\sigma(A) \subset \mathbb{R}$ and we no longer assume that the function $h$ in the multiplication operator version is real valued.

Let us recall the two stages in the proof of the spectral theorem (first version) for bounded self-adjoint operators. The first stage is the construction of the continuous functional calculus. The steps in this construction are (1) the equality of the norm and spectral radius for self-adjoint operators, (2) the spectral mapping theorem, and (3) the Stone-Weierstrass theorem. The second stage is a sort of operator-valued Riesz representation theorem, which we prove by reducing it to the ordinary Riesz representation theorem using quadratic forms. In generalizing from bounded self-adjoint to bounded normal operators, the second stage of the proof is precisely the same as in the self-adjoint case. In the first stage, however, there are some additional ideas needed in each step of the argument.

There is a relatively simple argument that reduces the equality of norm and spectral radius for normal operators to the self-adjoint case. Meanwhile, since the spectral mapping theorem, as stated in Chap. 8, already holds for arbitrary bounded operators, it appears that no change is needed in this step. We must think, however, about the proper notion of "polynomial." For a general normal operator $A$, the spectrum of $A$ is not contained in $\mathbb{R}$, and, thus, powers of $\lambda$ are complex-valued functions on $\sigma(A)$. We must, therefore, use the complex-valued version of the Stone-Weierstrass theorem (Appendix A.3.1), which requires that our algebra of functions be closed under complex-conjugation. This means that we need to consider polynomials in $\lambda$ and $\bar{\lambda}$, that is, linear combinations of functions of the form $\lambda^m \bar{\lambda}^n$.

What we need, then, is a form of the spectral mapping theorem that applies to this sort of polynomial. On the operator side, the natural counterpart to the complex conjugate of a function is the adjoint of an operator. Thus, applying the function $\lambda^m \bar{\lambda}^n$ to a normal operator $A$ should give $A^m (A^*)^n$. The desired "spectral mapping theorem" is then the following: If $p$ is a polynomial in two variables, and $A$ is a bounded normal operator, then

\[ \sigma(p(A, A^*)) = \left\{ p(\lambda, \bar{\lambda}) \mid \lambda \in \sigma(A) \right\}. \tag{10.21} \]

This statement is true (Theorem 10.23), but its proof is not nearly as simple as the proof of the ordinary spectral mapping theorem. One way to prove (10.21) is to use the theory of commutative $C^*$-algebras, as in [33]. (See Theorem 11.19 in [33] along with the assertion on p. 321 that the spectrum of an element is independent of the algebra containing that element.) Another approach is the direct argument found in Bernau [3], which uses no fancy machinery but which is long and not easily motivated. A third approach is to use the spectral theorem for bounded self-adjoint operators to help us prove (10.21); this is the approach we will follow.

We begin with the equality of norm and spectral radius and then turn to (10.21).

Proposition 10.21 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then

\[ \|A\| = R(A). \]

Lemma 10.22 If $A$ and $B$ are commuting elements of $\mathcal{B}(\mathbf{H})$, then

\[ R(AB) \le R(A)R(B). \]
Proof. If $A$ is any bounded operator, the proof of Lemma 8.1 shows that for any real number $T$ with $T > R(A)$, we have

\[ \lim_{m \to \infty} \frac{\|A^m\|}{T^m} = 0. \]

If $A$ and $B$ are two commuting bounded operators and $S$ and $T$ are two real numbers, with $S > R(A)$ and $T > R(B)$, then

\[ \frac{\|(AB)^m\|}{S^m T^m} = \frac{\|A^m B^m\|}{S^m T^m} \le \frac{\|A^m\|}{S^m} \frac{\|B^m\|}{T^m}. \]

Thus,

\[ \lim_{m \to \infty} \frac{\|(AB)^m\|}{S^m T^m} = 0. \tag{10.22} \]

Meanwhile, if we apply the expression for the resolvent in the proof of Lemma 8.1 to $AB$, we obtain

\[ (AB - \lambda)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m B^m}{\lambda^{m+1}}, \tag{10.23} \]

since $A$ and $B$ commute. For any $\lambda_1$ with $|\lambda_1| > R(A)R(B)$, take $\lambda_2$ with $|\lambda_1| > |\lambda_2| > R(A)R(B)$. The terms in (10.23) with $\lambda = \lambda_2$ tend to zero by (10.22), which means that (10.23) converges with $\lambda = \lambda_1$. Thus, $\lambda_1$ is in the resolvent set of $AB$. ■

Proof of Proposition 10.21. For any bounded operator, $\|A\| \ge R(A)$
(Proposition 7.5). To get the inequality in the other direction, recall
(Proposition 7.2) that $\|A\|^2 = \|A^*A\|$. Note also that $A^*A$ is
self-adjoint, since its adjoint is $A^*A^{**} = A^*A$. Thus, if $A$ and $A^*$
commute, we have

$$\|A\|^2 = \|A^*A\| = R(A^*A) \le R(A^*) R(A) \le \|A^*\| R(A) = \|A\| R(A).$$

Here we have used Lemmas 8.1 and 10.22 and the general inequality between
norm and spectral radius. Dividing by $\|A\|$ gives $\|A\| \le R(A)$, unless
$\|A\| = 0$, in which case the desired inequality is trivially satisfied. ■
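
As a numerical sanity check (not part of the original argument), the following sketch compares norm and spectral radius for two 2×2 matrices, one normal and one not. The helper names are illustrative, and the operator norm is computed from the largest eigenvalue of $A^*A$, which assumes the self-adjoint case of the norm/spectral-radius equality.

```python
import cmath
import math

def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adjoint(m):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(m[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial z^2 - tr(m) z + det(m)."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def spectral_radius(m):
    return max(abs(z) for z in eigenvalues_2x2(m))

def operator_norm(m):
    # ||A||^2 = ||A*A||, and A*A is self-adjoint and non-negative,
    # so its norm is its largest eigenvalue.
    return math.sqrt(spectral_radius(matmul(adjoint(m), m)))

normal = ((1, -2), (2, 1))     # commutes with its adjoint
nilpotent = ((0, 1), (0, 0))   # does not commute with its adjoint

print(operator_norm(normal), spectral_radius(normal))
print(operator_norm(nilpotent), spectral_radius(nilpotent))
```

For the normal matrix the two printed numbers agree up to rounding, while for the nilpotent matrix the spectral radius is zero although the norm is not, so the normality assumption in Proposition 10.21 cannot simply be dropped.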

Theorem 10.23 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then for any
polynomial $p$ in two variables, we have

$$\sigma(p(A, A^*)) = \{\, p(\lambda, \bar{\lambda}) \mid \lambda \in \sigma(A) \,\}.$$

If, for example, $p(\lambda, \bar{\lambda}) = \lambda^2 \bar{\lambda}^3$, then
$p(A, A^*) = A^2 (A^*)^3$. Note that since $A$ and $A^*$ are assumed to
commute, the map sending the polynomial $p(\lambda, \bar{\lambda})$ to
$p(A, A^*)$ is an algebra homomorphism. That is to say,
$(pq)(A, A^*) = p(A, A^*)\, q(A, A^*)$. This would not be the case if $A$ did
not commute with $A^*$.

We begin by proving Theorem 10.23 in the case that $A$ is a normal
matrix. Although the matrix case is quite simple, it provides an outline for
our assault on the general result.

Proof of Theorem 10.23 in the Matrix Case. For matrices, the spectrum
is nothing but the set of eigenvalues. If $A$ commutes with $A^*$, then
for any $\lambda \in \mathbb{C}$,

$$\begin{aligned}
\langle (A^* - \bar{\lambda} I)\psi, (A^* - \bar{\lambda} I)\psi \rangle
&= \langle \psi, (A - \lambda I)(A^* - \bar{\lambda} I)\psi \rangle \\
&= \langle \psi, (A^* - \bar{\lambda} I)(A - \lambda I)\psi \rangle \\
&= \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle.
\end{aligned} \tag{10.24}$$

Thus, if $\psi$ is an eigenvector for $A$ with eigenvalue $\lambda$, $\psi$ is
automatically an eigenvector for $A^*$ with eigenvalue $\bar{\lambda}$. It
then easily follows that $\psi$ is an eigenvector for $p(A, A^*)$ with
eigenvalue $p(\lambda, \bar{\lambda})$.

In the other direction, suppose $\mu$ is an eigenvalue for $p(A, A^*)$ and let
$W$ denote the $\mu$-eigenspace for $p(A, A^*)$. Since $A$ and $A^*$ commute
with each other, they also commute with $p(A, A^*)$. Thus, $A$ and $A^*$
preserve $W$, as is easily verified, and the operator $A|_W$ will have some
eigenvector $\psi$ with eigenvalue $\lambda$. Since $A\psi = \lambda\psi$,
then, as in (10.24), $A^*\psi = \bar{\lambda}\psi$ and so

$$p(A, A^*)\psi = p(\lambda, \bar{\lambda})\psi.$$

Since also $p(A, A^*)\psi = \mu\psi$, by assumption, we have
$\mu = p(\lambda, \bar{\lambda})$, where $\lambda$ is an eigenvalue for $A$. ■
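
The matrix case can be checked numerically on a small example. The sketch below is an illustration only: the matrix `A`, the helper functions, and the choice of the polynomial $\lambda^2 \bar{\lambda}^3$ (the example polynomial mentioned after Theorem 10.23) are assumptions of the sketch, and eigenvalues are found from the 2×2 characteristic polynomial.

```python
import cmath

def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adjoint(m):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(m[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def power(m, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, m)
    return result

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial z^2 - tr(m) z + det(m)."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def p(lam):
    """Sample polynomial p(lambda, conj(lambda)) = lambda^2 * conj(lambda)^3."""
    return lam ** 2 * lam.conjugate() ** 3

A = ((1, -2), (2, 1))
A_star = adjoint(A)
is_normal = matmul(A, A_star) == matmul(A_star, A)

# p(A, A*) = A^2 (A*)^3; the order of the factors is irrelevant because A is normal.
p_of_A = matmul(power(A, 2), power(A_star, 3))

# Sort both sides by imaginary part so the two spectra can be compared entrywise.
spectrum_of_p = sorted(eigenvalues_2x2(p_of_A), key=lambda z: z.imag)
p_of_spectrum = sorted((p(lam) for lam in eigenvalues_2x2(A)), key=lambda z: z.imag)

print(is_normal)
print(spectrum_of_p)
print(p_of_spectrum)
```

The two printed lists should agree up to rounding, which is the statement of Theorem 10.23 for this particular matrix and polynomial.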

We now attempt to run the same argument for a bounded normal operator
on $\mathbf{H}$, replacing “eigenvector” with “almost eigenvector,” where
$\psi$ is an $\varepsilon$-almost eigenvector for $A$ if
$\|(A - \lambda I)\psi\|$ is less than $\varepsilon \|\psi\|$. The main
difficulty with this approach is that for a given eigenvalue $\lambda$, the
set of $\varepsilon$-almost eigenvectors is not a vector space. To surmount
this difficulty, we will use the spectral theorem for the self-adjoint
operator $B^*B$, where $B = p(A, A^*) - \mu I$, with
$\mu \in \sigma(p(A, A^*))$. We will construct a spectral subspace $W$ for
$B^*B$ such that $W$ is invariant under $A$ and $A^*$ and such that each
element of $W$ is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with
eigenvalue $\mu$. (Note, however, that we are not claiming that $W$ contains
all the $\varepsilon$-almost eigenvectors for $p(A, A^*)$.)

Definition 10.24 If $A \in \mathcal{B}(\mathbf{H})$, then an
$\varepsilon$-almost eigenvector for $A$ with eigenvalue
$\lambda \in \mathbb{C}$ is a nonzero vector $\psi \in \mathbf{H}$ such that

$$\|(A - \lambda I)\psi\| < \varepsilon \|\psi\|.$$


We now establish three lemmas about almost eigenvectors, the last of
which makes use of the spectral theorem for bounded self-adjoint operators.
With these lemmas in hand, we will have a clear path to imitate the proof
of the matrix case of Theorem 10.23.

Lemma 10.25 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal.

1. If $\psi$ is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue
$\lambda$, then $\psi$ is an $\varepsilon$-almost eigenvector for $A^*$ with
eigenvalue $\bar{\lambda}$.

2. A number $\lambda \in \mathbb{C}$ belongs to $\sigma(A)$ if and only if for
all $\varepsilon > 0$, there exists an $\varepsilon$-almost eigenvector with
eigenvalue $\lambda$.

Proof. Point 1 follows immediately from (10.24), which holds for bounded
normal operators, not just matrices. For Point 2, suppose that an
$\varepsilon$-almost eigenvector with eigenvalue $\lambda$ exists for all
$\varepsilon > 0$. Then $A - \lambda I$ cannot have a bounded inverse, and so
$\lambda \in \sigma(A)$. In the other direction, if there is some
$\varepsilon > 0$ for which no $\varepsilon$-almost eigenvector exists, then

$$\|(A - \lambda I)\psi\| \ge \varepsilon \|\psi\| \tag{10.25}$$

for all $\psi \in \mathbf{H}$, showing that $A - \lambda I$ is injective. By
(10.24), the same inequality holds with $A - \lambda I$ replaced by
$A^* - \bar{\lambda} I$. Thus, $A^* - \bar{\lambda} I$ is injective, so by
Proposition 7.3, the range of $A - \lambda I$ is dense in $\mathbf{H}$. Using
(10.25) as in the proof of Proposition 7.7, it is easily seen that the range
of $A - \lambda I$ is also closed, hence all of $\mathbf{H}$. Thus,
$A - \lambda I$ is invertible and the inverse is bounded, by (10.25). ■

Lemma 10.26 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then for each
polynomial $p$ in two variables and each number $\lambda \in \mathbb{C}$,
there is a constant $C$ such that if $\psi$ is an $\varepsilon$-almost
eigenvector for $A$ with eigenvalue $\lambda$, then $\psi$ is a
$(C\varepsilon)$-almost eigenvector for $p(A, A^*)$ with eigenvalue
$p(\lambda, \bar{\lambda})$.

Proof. We decompose $p(A, A^*) - p(\lambda, \bar{\lambda}) I$ into a linear
combination of terms of the form $A^k (A^*)^l - \lambda^k \bar{\lambda}^l I$
and we estimate such terms by induction on $k + l$. If $k = 1$ and $l = 0$,
there is nothing to prove, and if $k = 0$ and $l = 1$, we use (10.24). Assume
now that we have established the desired result for $k + l = N$ and consider
a case with $k + l = N + 1$. If $k > 0$, we write

$$\begin{aligned}
\left( A^k (A^*)^l - \lambda^k \bar{\lambda}^l I \right)\psi
&= A^{k-1} (A^*)^l (A - \lambda I)\psi \\
&\quad + \lambda \left( A^{k-1} (A^*)^l - \lambda^{k-1} \bar{\lambda}^l I \right)\psi.
\end{aligned} \tag{10.26}$$

Since $\psi$ is an $\varepsilon$-almost eigenvector and $A$ and $A^*$ are
bounded, the norm of the first term on the right-hand side of (10.26) is at
most $c_1 \varepsilon \|\psi\|$. By induction, the norm of the second term on
the right-hand side of (10.26) is at most $|\lambda| c_2 \varepsilon \|\psi\|$.
Thus, the norm of the left-hand side of (10.26) is at most
$(c_1 + |\lambda| c_2)\varepsilon \|\psi\|$. A similar analysis holds if
$k = 0$, in which case $l > 0$. ■


Lemma 10.27 Let $A \in \mathcal{B}(\mathbf{H})$ be normal, let $p$ be a
polynomial in two variables, and let $\mu$ be an element of the spectrum of
$p(A, A^*)$. Then for all $\varepsilon > 0$, there exists a nonzero closed
subspace $W_\varepsilon$ of $\mathbf{H}$ such that $W_\varepsilon$ is
invariant under $A$ and $A^*$ and such that every nonzero element of
$W_\varepsilon$ is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with
eigenvalue $\mu$.

Proof. Fix some $\mu$ in the spectrum of $p(A, A^*)$ and let
$B = p(A, A^*) - \mu I$. Then $B$ is normal and 0 belongs to the spectrum of
$B$. Using Point 2 of Lemma 10.25 and Lemma 10.26, we see that 0 belongs to
the spectrum of the self-adjoint operator $B^*B$. We apply the spectral
theorem to $B^*B$ and we let $W_\varepsilon$ be the spectral subspace for
$B^*B$ corresponding to the interval $(-\varepsilon^2, \varepsilon^2)$. By
Proposition 7.15, $W_\varepsilon$ is nonzero and invariant under $B^*B$, and
the restriction of $B^*B$ to $W_\varepsilon$ has norm at most $\varepsilon^2$.
Thus, for all $\psi \in W_\varepsilon$ we have

$$\|B\psi\|^2 = \langle \psi, B^*B\psi \rangle \le \|\psi\| \, \|B^*B\psi\| \le \varepsilon^2 \|\psi\|^2.$$

Since $B = p(A, A^*) - \mu I$, this shows that every nonzero element of
$W_\varepsilon$ is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with
eigenvalue $\mu$. Furthermore, $A$ and $A^*$ commute with $B^*B$ and thus they
preserve each spectral subspace of $B^*B$ (Proposition 7.16), including
$W_\varepsilon$. ■

Proof of Theorem 10.23. Suppose first that $\lambda$ belongs to the spectrum
of $A$. By Point 2 of Lemma 10.25, $A$ has $\varepsilon$-almost eigenvectors
with eigenvalue $\lambda$ for every $\varepsilon > 0$. Lemma 10.26 then shows
that $p(A, A^*)$ has $(C\varepsilon)$-almost eigenvectors with eigenvalue
$p(\lambda, \bar{\lambda})$ for every $\varepsilon > 0$, which shows that
$p(\lambda, \bar{\lambda})$ is in the spectrum of $p(A, A^*)$.

In the other direction, suppose that $\mu$ is in the spectrum of $p(A, A^*)$.
For any $\varepsilon > 0$, we consider the nonzero subspace $W_\varepsilon$ in
Lemma 10.27, which is invariant under $A$ and $A^*$. The restriction of $A$ to
$W_\varepsilon$ is again a normal operator (Exercise 8), and
$A|_{W_\varepsilon}$ has nonempty spectrum (Proposition 7.5). If we fix some
$\lambda \in \sigma(A|_{W_\varepsilon})$, Lemma 10.25 tells us that there
exists an $\varepsilon$-almost eigenvector $\psi$ for $A$ in $W_\varepsilon$.
By Lemma 10.26, $\psi$ is a $(C\varepsilon)$-almost eigenvector for
$p(A, A^*)$ with eigenvalue $p(\lambda, \bar{\lambda})$. Meanwhile, since
$\psi \in W_\varepsilon$, the same vector $\psi$ is also an
$\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$. It
then is easy to see (Exercise 10) that

$$|\mu - p(\lambda, \bar{\lambda})| < C\varepsilon + \varepsilon. \tag{10.27}$$

Since (10.27) holds for all $\varepsilon > 0$, we can find a sequence
$\lambda_n$ of points in $\sigma(A)$ such that
$p(\lambda_n, \bar{\lambda}_n) \to \mu$. Since $\sigma(A)$ is compact, we can
pass to a subsequence of the $\lambda_n$'s that is convergent to some
$\lambda \in \sigma(A)$, and this $\lambda$ will satisfy
$p(\lambda, \bar{\lambda}) = \mu$. ■

Combining Theorem 10.23 with the equality of the norm and spectral
radius for normal operators (Proposition 10.21), we have the following
result. If $A \in \mathcal{B}(\mathbf{H})$ is normal and $p$ is a polynomial
in two variables, then

$$\|p(A, A^*)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda, \bar{\lambda})|.$$

The map $p \mapsto p(A, A^*)$ has the property that
$\bar{p}(A, A^*) = (p(A, A^*))^*$, where the polynomial $\bar{p}$ is the
complex-conjugate of $p$. In particular, if $p$ takes only real values on
$\sigma(A)$, then $p(A, A^*)$ is self-adjoint.

By the complex-valued version of the Stone-Weierstrass theorem (A.12),
polynomials in $\lambda$ and $\bar{\lambda}$ are dense in
$C(\sigma(A); \mathbb{C})$, the space of continuous complex-valued functions
on $\sigma(A)$. Thus, the BLT theorem (Theorem A.36) tells us that we can
extend the map $p \mapsto p(A, A^*)$ to an isometric map of
$C(\sigma(A); \mathbb{C})$ into $\mathcal{B}(\mathbf{H})$. This extension,
which we call the continuous functional calculus for $A$, has all the same
properties as in the self-adjoint case.

Now  that  the  continuous  functional  calculus  for  normal  operators  has 
been  established,  the  proof  of  the  spectral  theorem — in  any  of  its  various 
versions — proceeds  exactly  as  in  the  self-adjoint  case.  There  is  no  need, 
then,  to  repeat  the  arguments  given  in  Chap.  8. 


10.4  Proof  of  the  Spectral  Theorem  for  Unbounded 
Self-Adjoint  Operators 

To prove the spectral theorem for an unbounded self-adjoint operator $A$,
we will construct from $A$ a certain unitary (and thus normal) operator
$U$. We then apply the spectral theorem for bounded normal operators to
$U$ and translate this result into the desired result for $A$. To motivate
the construction of $U$, consider the function

$$C(x) := \frac{x + i}{x - i}, \quad x \in \mathbb{R}. \tag{10.28}$$

It is a simple matter to check that $C$ maps $\mathbb{R}$ injectively onto
$S^1 \setminus \{1\}$, with inverse given by

$$D(u) := i\,\frac{u + 1}{u - 1}, \quad u \in S^1 \setminus \{1\}. \tag{10.29}$$

Furthermore, we have $\lim_{x \to \pm\infty} C(x) = 1$. The function $C(x)$ in
(10.28) is the simplest bounded, injective function one can define on
$\mathbb{R}$.
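
The properties of $C$ and $D$ claimed above are easy to probe numerically. The following sketch is only an illustration; the function names `cayley` and `inverse_cayley` and the sample points are choices made here, not notation from the text.

```python
def cayley(x):
    """C(x) = (x + i) / (x - i), as in (10.28)."""
    return (x + 1j) / (x - 1j)

def inverse_cayley(u):
    """D(u) = i (u + 1) / (u - 1), as in (10.29); undefined at u = 1."""
    return 1j * (u + 1) / (u - 1)

samples = [-100.0, -2.5, 0.0, 0.3, 7.0]
images = [cayley(x) for x in samples]

moduli = [abs(u) for u in images]                # each should be 1 (up to rounding)
recovered = [inverse_cayley(u) for u in images]  # each should return the original x
far_out = cayley(1e8)                            # should lie close to the excluded point 1

for x, u, back in zip(samples, images, recovered):
    print(x, u, back)
print(far_out)
```

The printed images lie on the unit circle, applying `inverse_cayley` returns the original real numbers (as complex values with negligible imaginary part), and `cayley` of a very large argument approaches the excluded point 1.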

We wish to apply the map $C$ to a self-adjoint operator $A$. If $A$ is bounded
and self-adjoint, it is straightforward to check that the operator
$(A + iI)(A - iI)^{-1}$ is unitary (Exercise 5). Even in the unbounded case,
it is possible to make sense of the operator $U := C(A)$, and we can recover
$A$ from $U$ by (essentially) applying $D$. The operator $U$ is unitary and is
known as the Cayley transform of $A$.

Recall that if $A$ is self-adjoint, then $i$ is in the resolvent set of $A$
and the operator $(A - iI)^{-1}$ maps $\mathbf{H}$ into $\mathrm{Dom}(A)$.

Theorem 10.28 (Cayley Transform) If $A$ is a self-adjoint operator on
$\mathbf{H}$, let $U$ be the operator defined by

$$U\psi = (A + iI)(A - iI)^{-1}\psi.$$

Then the following results hold.

1. The operator $U$ is a unitary operator on $\mathbf{H}$.

2. The operator $U - I$ is injective.

3. The range of the operator $U - I$ is equal to $\mathrm{Dom}(A)$ and for all
$\psi \in \mathrm{Range}(U - I)$ we have

$$A\psi = i(U + I)(U - I)^{-1}\psi. \tag{10.30}$$

According to Point 2, $U - I$ is injective, while according to Point 3, the
range of $U - I$ is $\mathrm{Dom}(A)$. Thus, in (10.30), the expression
$(U - I)^{-1}$ refers to the inverse of the one-to-one and onto map
$U - I : \mathbf{H} \to \mathrm{Dom}(A)$. We are not claiming that 1 is in the
resolvent set of $U$. That is to say, $(U - I)^{-1}$ is not a bounded
operator, unless $\mathrm{Dom}(A) = \mathbf{H}$, which occurs only if $A$ is
bounded.

Proof. The resolvent operator $(A - iI)^{-1}$ must be injective, because

$$(A - iI)(A - iI)^{-1}\psi = \psi$$

for all $\psi \in \mathbf{H}$. Furthermore, $(A - iI)^{-1}$ maps $\mathbf{H}$
onto $\mathrm{Dom}(A)$, because

$$\psi = (A - iI)^{-1}(A - iI)\psi$$

for all $\psi \in \mathrm{Dom}(A)$. Since $-i$ is also in the resolvent set of
$A$, similar reasoning shows that $A + iI$ maps $\mathrm{Dom}(A)$ injectively
onto $\mathbf{H}$. Thus, $U$ is the composition of one operator that maps
$\mathbf{H}$ injectively onto $\mathrm{Dom}(A)$ and another operator that maps
$\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$, so that $U$ maps $\mathbf{H}$
injectively onto $\mathbf{H}$.

Now, for any $\phi \in \mathrm{Dom}(A)$ we have

$$\begin{aligned}
\langle (A + iI)\phi, (A + iI)\phi \rangle
&= \langle A\phi, A\phi \rangle + \langle \phi, \phi \rangle \\
&= \langle (A - iI)\phi, (A - iI)\phi \rangle,
\end{aligned}$$

because of a familiar cancellation of cross terms. Thus, applying this with
$\phi = (A - iI)^{-1}\psi$ shows that for any $\psi \in \mathbf{H}$, we have

$$\begin{aligned}
\langle (A + iI)(A - iI)^{-1}\psi, (A + iI)(A - iI)^{-1}\psi \rangle
&= \langle (A - iI)(A - iI)^{-1}\psi, (A - iI)(A - iI)^{-1}\psi \rangle \\
&= \langle \psi, \psi \rangle.
\end{aligned}$$

Thus, $U$ is one-to-one and onto and preserves norms and is therefore
unitary.

For Point 2, observe that for any $\psi \in \mathbf{H}$, we have

$$\begin{aligned}
(A + iI)(A - iI)^{-1}\psi &= ((A - iI) + 2iI)(A - iI)^{-1}\psi \\
&= \psi + 2i(A - iI)^{-1}\psi.
\end{aligned} \tag{10.31}$$

Thus, since $(A - iI)^{-1}$ is injective, we cannot have $U\psi = \psi$ unless
$\psi = 0$. Finally, for Point 3, (10.31) says that

$$U - I = 2i(A - iI)^{-1}, \tag{10.32}$$

which means (by the reasoning at the start of the proof) that the range of
$U - I$ is $\mathrm{Dom}(A)$. For $\psi \in \mathrm{Dom}(A)$, we then have

$$\begin{aligned}
(U + I)(U - I)^{-1}\psi &= \frac{1}{2i}(U + I)(A - iI)\psi \\
&= \frac{1}{2i}\left[ (A + iI) + (A - iI) \right]\psi \\
&= \frac{1}{i} A\psi,
\end{aligned}$$

which establishes Point 3. ■
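
In finite dimensions every self-adjoint operator is bounded, so the identities in Theorem 10.28 can be checked directly on a matrix. The sketch below is an illustration under that assumption only (it says nothing about domain questions in the unbounded case); the 2×2 Hermitian matrix `A` and all helper names are choices made here.

```python
I2 = ((1, 0), (0, 1))

def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))

def scale(c, m):
    return tuple(tuple(c * m[i][j] for j in range(2)) for i in range(2))

def adjoint(m):
    return tuple(
        tuple(complex(m[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def distance(x, y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

A = ((2, 1 - 1j), (1 + 1j, -1))                 # a self-adjoint matrix
resolvent = inverse(add(A, scale(-1j, I2)))     # (A - iI)^{-1}
U = matmul(add(A, scale(1j, I2)), resolvent)    # U = (A + iI)(A - iI)^{-1}
U_minus_I = add(U, scale(-1, I2))

unitarity_defect = distance(matmul(adjoint(U), U), I2)           # U*U = I
resolvent_defect = distance(U_minus_I, scale(2j, resolvent))     # (10.32)
A_recovered = scale(1j, matmul(add(U, I2), inverse(U_minus_I)))  # (10.30)
recovery_defect = distance(A_recovered, A)

print(unitarity_defect, resolvent_defect, recovery_defect)
```

All three printed defects should be at the level of floating-point rounding, reflecting unitarity of $U$, the identity (10.32), and the recovery formula (10.30) for this matrix.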

We may apply the spectral theorem for bounded normal operators to
associate a projection-valued measure $\mu^U$ to $U$. We will then transfer
this measure from $S^1 \setminus \{1\}$ to $\mathbb{R}$ by means of the map
$D$ in (10.29) to obtain the desired projection-valued measure $\mu^A$ for $A$.

Proposition 10.29 Let $A$ be a self-adjoint operator on $\mathbf{H}$, let $U$
be the unitary operator in Theorem 10.28, and let
$D : S^1 \setminus \{1\} \to \mathbb{R}$ be as in (10.29). Then

$$A = D(U), \tag{10.33}$$

where $D(U)$ is defined by the functional calculus for $U$.

More precisely, $D(U) = \int D(\lambda)\, d\mu^U(\lambda)$, where $\mu^U$ is
the projection-valued measure associated to $U$ by the spectral theorem for
bounded normal operators. Note that by Point 2 of Theorem 10.28, 1 is not an
eigenvalue for $U$ and thus $\mu^U(\{1\}) = 0$. Thus, $D$ is an
almost-everywhere-defined function on $\sigma(U)$, even if
$1 \in \sigma(U)$. As always, the equality in (10.33) includes equality of
domains, where the domain of $\int D \, d\mu^U$ is the space $W_D$ in
Proposition 10.1.

Proposition 10.29 should certainly be plausible in light of the previously
established formula (10.30) for $A$ in terms of $U$.

Proof. Suppose $E$ is a Borel subset of $S^1 \setminus \{1\}$ such that the
closure of $E$ does not contain 1, and let $V_E = \mathrm{Range}(\mu^U(E))$ be
the associated spectral subspace. Then the spectrum of $U|_{V_E}$ is contained
in $\bar{E}$, which means that the functions $u \mapsto D(u)$ and
$u \mapsto 1/(u - 1)$ are bounded on $\sigma(U|_{V_E})$. Now, by comparing the
quadratic forms, we can see that $D(U)|_{V_E} = D(U|_{V_E})$. Then by the
multiplicativity of the functional calculus for $U$ on bounded functions, we
have

$$D(U)\psi = i(U + I)(U - I)^{-1}\psi$$

for all $\psi \in V_E$. Thus, by Point 3 of Theorem 10.28, $D(U)$ agrees with
$A$ on $V_E$.




Meanwhile, if we decompose $S^1 \setminus \{1\}$ as the disjoint union of sets
$E_n$ for which $\bar{E}_n$ does not contain 1, then $\mathbf{H}$ is the
Hilbert space direct sum of the subspaces $V_{E_n}$. Now, $A$ and (by
Proposition 10.3) $D(U)$ are both self-adjoint. Furthermore, these operators
agree on the finite direct sum of the $V_{E_n}$'s and they are essentially
self-adjoint on this finite sum, by Example 9.26. Thus, $A$ and $D(U)$ must be
equal (with equality of domain). ■

Theorem 10.30 Define a projection-valued measure $\mu^A$ on $\mathbb{R}$ by

$$\mu^A(E) = \mu^U(C(E)). \tag{10.34}$$

Then

$$A = \int_{\mathbb{R}} \lambda \, d\mu^A(\lambda), \tag{10.35}$$

where $\mu^U$ is the projection-valued measure coming from the spectral
theorem for the bounded normal operator $U$ and $C$ is the map defined in
(10.28).

Proof. If for any $\psi \in \mathbf{H}$, we define
$\mu^U_\psi(E) = \langle \psi, \mu^U(E)\psi \rangle$, and similarly define
$\mu^A_\psi$, then we have

$$\mu^A_\psi(E) = \mu^U_\psi(C(E)).$$

By the abstract change of variables theorem from measure theory, we have

$$\int_{\mathbb{R}} \lambda^2 \, d\mu^A_\psi(\lambda) = \int_{S^1 \setminus \{1\}} D(u)^2 \, d\mu^U_\psi(u), \tag{10.36}$$

since $D$ is the inverse map to $C$. Thus, the two operators in (10.35) have
the same domain. Furthermore, if we replace $\lambda^2$ by $\lambda$ and
$D(u)^2$ by $D(u)$ in (10.36), we see that the operators in (10.35) are also
equal. ■

Proof of Theorem 10.4. The existence of the desired projection-valued
measure $\mu^A$ is the content of Theorem 10.30. To establish uniqueness,
suppose $\nu^A$ is a projection-valued measure on $\sigma(A)$ such that
$\int \lambda \, d\nu^A(\lambda) = A$. Consider then the operator $C(A)$ as
defined by integration of the function $C(\lambda)$ against $\nu^A$. Arguing
as in the proof of Proposition 10.29, we can see that $C(A)$, computed in this
fashion, coincides with the operator $U = C(A)$ defined as the product of
$(A + iI)$ and $(A - iI)^{-1}$.

Now define a projection-valued measure $\nu^U$ on $S^1$ by setting
$\nu^U(E) = \nu^A(C^{-1}(E))$. Then as in the proof of Theorem 10.30, we have
$\int_{S^1} u \, d\nu^U(u) = U$. The uniqueness part of the spectral theorem
for $U$ (Theorem 10.20) then tells us that $\nu^U = \mu^U$, from which it
follows that $\nu^A = \mu^A$. ■

Proof of Theorem 10.9. By the direct-integral form of the spectral theorem
for $U = C(A)$, there is a family of Hilbert spaces $\mathbf{H}_\lambda$,
$\lambda \in \sigma(U) \subset S^1$, and a positive, real-valued measure $\mu$
on $\sigma(U)$ such that $\mathbf{H}$ is unitarily equivalent to
$\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda \, d\mu(\lambda)$, in such a way
that the operator $U$ corresponds to the map
$s(\lambda) \mapsto \lambda s(\lambda)$. Since 1 is not an eigenvalue for $U$,
either $\mathbf{H}_1 = \{0\}$ or $\mu(\{1\}) = 0$. Either way, $\mathbf{H}_1$
is “negligible” in the direct integral. We can then define a family of Hilbert
spaces $\mathbf{K}_\lambda := \mathbf{H}_{C(\lambda)}$, for
$\lambda \in \sigma(A) \subset \mathbb{R}$, and a measure $\nu$ on $\sigma(A)$
given by $\nu(E) = \mu(C(E))$. We may then form the direct integral
$\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda \, d\nu(\lambda)$. This direct
integral is unitarily equivalent in an obvious way to
$\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda \, d\mu(\lambda)$. We wish to
show, then, that $\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda \, d\nu(\lambda)$
is unitarily equivalent to $\mathbf{H}$ in such a way that the operator $A$
corresponds to the (unbounded) operator mapping $s(\lambda)$ to
$\lambda s(\lambda)$. Since the argument is similar to that in the proof of
Theorem 10.4, we omit the details.

As in the proof of Theorem 10.4, the uniqueness in Theorem 10.9 can
be reduced to the uniqueness for the direct-integral form of the spectral
theorem for $U$. ■

The  proof  of  the  multiplication  operator  form  of  the  spectral  theorem 
for  unbounded  operators  is  similar  to  the  preceding  proofs  and  is  omitted. 


10.5  Exercises 

1. (a) If $A$ is a bounded self-adjoint operator, show that
$U(t) := e^{itA}$ is continuous in the operator norm topology.

(b) Using the spectral theorem, show that if $A$ is a self-adjoint operator
and $\sigma(A)$ is a bounded subset of $\mathbb{R}$, then $A$ is bounded.

(c) Suppose $A$ is a self-adjoint operator that is not bounded. Show
that $U(t) := e^{itA}$ is not continuous in the operator norm topology.

Hint: Consider $\psi$ in a spectral subspace of the form
$V_{(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)}$, where $\lambda_0$ is
a point in $\sigma(A)$ with $|\lambda_0|$ large.

2. Let $P_j$ be the unbounded self-adjoint operator defined in Sect. 9.8.
Show that the one-parameter unitary group $e^{itP_j}$ generated by $P_j$ is
given by

$$(e^{itP_j}\psi)(x) = \psi(x + t\hbar e_j)$$

for all $\psi \in L^2(\mathbb{R}^n)$, where $e_j$ is the $j$th element of the
standard basis for $\mathbb{R}^n$.

Hint: First determine the Fourier transform of $e^{itP_j}\psi$ using
Proposition 9.32.

3. If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, let us say
that a family $\psi(t)$ of elements of $\mathbf{H}$ satisfies the equation

$$\frac{d\psi}{dt} = iA\psi(t) \tag{10.37}$$

in the strong sense if each $\psi(t)$ belongs to $\mathrm{Dom}(A)$ and

$$\lim_{h \to 0} \left\| \frac{\psi(t + h) - \psi(t)}{h} - iA\psi(t) \right\| = 0$$

for every $t \in \mathbb{R}$. If we define $\psi(t)$ by
$\psi(t) = e^{itA}\psi_0$, for some $\psi_0 \in \mathbf{H}$, show that
$\psi(t)$ satisfies (10.37) in the strong sense if and only if $\psi_0$
belongs to $\mathrm{Dom}(A)$.

4. Suppose $A$ is an unbounded self-adjoint operator and suppose that
there exists a number $\gamma \in \mathbb{R}$ and a nonzero vector
$\psi \in \mathrm{Dom}(A)$ such that

$$\|A\psi - \gamma\psi\| < \varepsilon \|\psi\|$$

for some $\varepsilon > 0$. Show that there exists a number $\tilde{\gamma}$
in the spectrum of $A$ such that $|\gamma - \tilde{\gamma}| < \varepsilon$.

Hint: If no such $\tilde{\gamma}$ existed, the function
$f(\lambda) := 1/(\lambda - \gamma)$ would satisfy
$|f(\lambda)| \le 1/\varepsilon$ for all $\lambda \in \sigma(A)$. Consider,
then, the operator $f(A)$, which is nothing but $(A - \gamma I)^{-1}$.


5. If $A$ is a bounded self-adjoint operator, show that the operator $C(A)$
given by

$$C(A) = (A + iI)(A - iI)^{-1}$$

is unitary and that 1 is in the resolvent set of $C(A)$. Show also that
$A$ can be recovered from $C(A)$ by the formula

$$A = i(C(A) + I)(C(A) - I)^{-1}.$$


6.  Show  that  Lemma  10.22  is  false  if  we  do  not  assume  that  A  and  B 
commute. 


7. Let $A$ be a normal matrix and $p$ a polynomial in two variables. Show
by example that an eigenvector for $p(A, A^*)$ is not necessarily an
eigenvector for $A$.

Note: Nevertheless, the proof of the matrix case of Theorem 10.23
shows that if $\mu$ is an eigenvalue for $p(A, A^*)$, then there exists some
eigenvector for $p(A, A^*)$ with eigenvalue $\mu$ that is also an eigenvector
for $A$.

8. Suppose $A \in \mathcal{B}(\mathbf{H})$ and $W$ is a closed subspace of
$\mathbf{H}$ that is invariant under $A$ and $A^*$.

(a) Show that $(A|_W)^* = A^*|_W$.

(b) Show that if $A$ is normal, the restriction of $A$ to $W$ is normal.

9. (a) Suppose that $\mathbf{H}$ is finite dimensional, $A$ is a normal
operator on $\mathbf{H}$, and $W$ is a subspace of $\mathbf{H}$ that is
invariant under $A$. Show that $W$ is invariant under $A^*$.

(b) Show by example that the result of Part (a) is false if $\mathbf{H}$ is
infinite dimensional.

10. Given $A \in \mathcal{B}(\mathbf{H})$, suppose that the same vector $\psi$
is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$ and a
$\delta$-almost eigenvector for $A$ with eigenvalue $\mu$. Show that
$|\lambda - \mu| < \varepsilon + \delta$.


11 

The  Harmonic  Oscillator 


11.1  The  Role  of  the  Harmonic  Oscillator 

The  harmonic  oscillator  is  an  important  model  for  various  reasons.  In 
solid-state  physics,  for  example,  a  crystal  is  modeled  as  a  large  number 
of  coupled  harmonic  oscillators.  Using  the  notion  of  “normal  modes,”  this 
model  is  then  transformed  into  independent  one-dimensional  harmonic 
oscillators  with  different  frequencies.  In  the  quantum  mechanical  setting, 
the  excitations  of  the  different  normal  modes  are  called  phonons. 

A free quantum field theory is similarly modeled as a family of coupled
harmonic oscillators, except that in the field theory setting we have
infinitely many of the oscillators. Even interacting quantum field theories
are often described using the harmonic oscillator raising and lowering
operators, which are referred to as creation and annihilation operators in
the context of field theory.

Our approach to analyzing the harmonic oscillator also introduces the
algebraic approach to quantum mechanics, in which algebra (commutation
relations between various operators) substantially replaces analysis
(differential equations) as the way to solve quantum systems. Most of the
effort in analyzing the harmonic oscillator occurs in the algebraic section
(Sect. 11.2), with the remaining analytic issues being taken care of in
Sects. 11.3 and 11.4.


11.2 The Algebraic Approach

In this section we will derive as much information as possible about the
Hamiltonian operator for a quantum harmonic oscillator using only the
commutation relation between the position and momentum operators,

$$[X, P] = i\hbar I. \tag{11.1}$$

Here, as usual, $[\cdot, \cdot]$ denotes the commutator, given by
$[A, B] = AB - BA$. We consider, then, a harmonic oscillator with Hamiltonian
given by

$$\hat{H} = \frac{P^2}{2m} + \frac{k}{2} X^2, \tag{11.2}$$

where $k$ is a positive constant. Our goal is to see what we can say about
the eigenvectors and eigenvalues of $\hat{H}$ using only the fact that $X$ and
$P$ are self-adjoint operators satisfying (11.1), without making use of the
actual formulas for these operators.

To be honest, we are actually assuming certain domain conditions regarding
the operators $X$ and $P$, in addition to the commutation relation (11.1),
namely that the vectors in Theorem 11.2 are actually in the domain of
$X$ and $P$ (and thus, also, in the domain of the raising and lowering
operators). In this section, we follow the usual physics practice of assuming
that all the vectors we work with are in the domain of all the relevant
operators. This assumption will turn out to be correct in the case we are
actually considering, in which $X$ and $P$ are the usual position and momentum
operators on $L^2(\mathbb{R})$. (See Sect. 11.4.) It is a more complicated
matter to work out the domain conditions that must be imposed on two
self-adjoint operators satisfying (11.1) in order for the argument of the
present section to be valid. We will come back to this issue in Chap. 14.

Following, again, the convention in the physics literature, we now eliminate
the spring constant $k$ in favor of the frequency $\omega = \sqrt{k/m}$ of the
corresponding classical harmonic oscillator. [Solutions to Hamilton's
equations with classical Hamiltonian $H(x, p)$ equal to
$p^2/(2m) + kx^2/2$ are sinusoidal with frequency $\sqrt{k/m}$.] Replacing
$k$ by $m\omega^2$, we may rewrite (11.2) as

$$\hat{H} = \frac{1}{2m}\left( P^2 + (m\omega X)^2 \right). \tag{11.3}$$

We now introduce the “lowering operator” $a$, given by

$$a = \frac{m\omega X + iP}{\sqrt{2\hbar m\omega}} \tag{11.4}$$

and its adjoint $a^*$, the “raising operator,” given by

$$a^* = \frac{m\omega X - iP}{\sqrt{2\hbar m\omega}}. \tag{11.5}$$



The reason for the terminology “raising” and “lowering” is that these
operators raise and lower the eigenvalue for the Hamiltonian, as we will
see shortly. In the context of quantum field theory, operators very much
like $a^*$ and $a$ are called creation operators and annihilation operators,
respectively, because they map from the $n$-particle space to either the
$(n+1)$-particle space or the $(n-1)$-particle space, thus “creating” or
“annihilating” a particle.

In the world of noncommuting operators, $(A - B)(A + B)$ does not equal
$A^2 - B^2$; rather,

$$(A - B)(A + B) = A^2 - B^2 + [A, B].$$

Thus, if we compute $a^*a$ using (11.1) we get

$$\begin{aligned}
a^*a &= \frac{1}{2\hbar m\omega}\left( (m\omega X)^2 + P^2 + i m\omega [X, P] \right) \\
&= \frac{1}{2\hbar m\omega}\left( P^2 + (m\omega X)^2 \right) - \frac{1}{2} I.
\end{aligned}$$

From this we obtain

$$\hat{H} = \hbar\omega\left( a^*a + \frac{1}{2} I \right).$$

The $\frac{1}{2} I$ on the right-hand side of this expression should be
thought of as a “quantum correction,” in that there would be no such term in
the analogous formula for the classical Hamiltonian.

It suffices to work out the spectral properties (eigenvectors and
eigenvalues) of $a^*a$. To get back to $\hat{H}$, we keep the same
eigenvectors and simply add 1/2 to the eigenvalues and then multiply by
$\hbar\omega$. We compute that

$$\begin{aligned}
[a, a^*] &= \frac{1}{2\hbar m\omega}\left( [m\omega X, -iP] + [iP, m\omega X] \right) \\
&= \frac{1}{2\hbar m\omega}\left( \hbar m\omega I + \hbar m\omega I \right) \\
&= I.
\end{aligned} \tag{11.6}$$

From this, it is easy to compute that

$$[a, a^*a] = a \tag{11.7}$$

$$[a^*, a^*a] = -a^*. \tag{11.8}$$

Now, $a^*a$ is self-adjoint (or, at the least, symmetric) because
$(a^*a)^* = a^*a^{**} = a^*a$. This operator is also non-negative, because

$$\langle \psi, a^*a\psi \rangle = \langle a\psi, a\psi \rangle \ge 0$$

for all $\psi$. We now come to a key computation, which demonstrates the
utility of the operators $a$ and $a^*$.

Proposition 11.1 Suppose that $\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda$. Then

$$\begin{aligned}
a^*a(a\psi) &= (\lambda - 1)a\psi \\
a^*a(a^*\psi) &= (\lambda + 1)a^*\psi.
\end{aligned}$$

Thus, either $a\psi$ is zero or $a\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda - 1$. Similarly, either $a^*\psi$ is zero or $a^*\psi$ is
an eigenvector for $a^*a$ with eigenvalue $\lambda + 1$. That is to say, the
operators $a^*$ and $a$ raise and lower the eigenvalues of $a^*a$,
respectively.

Proof. Using the commutation relation (11.7), we find that

$$a^*a(a\psi) = (a(a^*a) - a)\psi = (\lambda - 1)a\psi.$$

A similar calculation applies to $a^*\psi$, using (11.8). ■

If $\psi$ is an eigenvector for $a^*a$ with eigenvalue $\lambda$, then

$$\lambda\langle\psi, \psi\rangle = \langle\psi, a^*a\psi\rangle = \langle a\psi, a\psi\rangle \geq 0,$$

which means that $\lambda \geq 0$. Let us assume that $a^*a$ has at least one eigenvector
$\psi$, with eigenvalue $\lambda$, which we expect since $a^*a$ is self-adjoint. Since
$a$ lowers the eigenvalue of $a^*a$, if we apply $a$ repeatedly to $\psi$, we must
eventually get zero. After all, if $a^n\psi$ were always nonzero, these vectors
would be, for large $n$, eigenvectors for $a^*a$ with negative eigenvalue, which
we have seen is impossible.

It follows that there exists some $N \geq 0$ such that $a^N\psi \neq 0$ but $a^{N+1}\psi = 0$.
If we define $\psi_0$ by

$$\psi_0 := a^N\psi,$$

then $a\psi_0 = 0$, which means that $a^*a\psi_0 = 0$. Thus, $\psi_0$ is an eigenvector for
$a^*a$ with eigenvalue 0. (It follows that the original eigenvalue $\lambda$ must have
been equal to the non-negative integer $N$.)

The conclusion is this: Provided that $a^*a$ has at least one eigenvector $\psi$,
we can find a nonzero vector $\psi_0$ such that

$$a\psi_0 = a^*a\psi_0 = 0.$$

Since $a^*a$ cannot have negative eigenvalues, we may call $\psi_0$ a “ground state”
for $a^*a$, that is, an eigenvector with lowest possible eigenvalue. We may then
apply the raising operator $a^*$ repeatedly to $\psi_0$ to obtain eigenvectors for
$a^*a$ with positive eigenvalues.

Theorem 11.2 If $\psi_0$ is a unit vector with the property that $a\psi_0 = 0$, then
the vectors

$$\psi_n := (a^*)^n\psi_0, \quad n \geq 0,$$

satisfy the following relations for all $n, m \geq 0$:

$$a^*\psi_n = \psi_{n+1}$$

$$a^*a\psi_n = n\psi_n$$

$$\langle\psi_n, \psi_m\rangle = n!\,\delta_{n,m}$$

$$a\psi_{n+1} = (n + 1)\psi_n.$$


Let us think for a moment about what this is saying. We have an orthogonal
“chain” of eigenvectors for $a^*a$ with eigenvalues 0, 1, 2, ..., with the
norm of $\psi_n$ equal to $\sqrt{n!}$. The raising operator $a^*$ shifts us up the chain,
while the lowering operator $a$ shifts us down the chain (up to a constant).
In particular, the “ground state” $\psi_0$ is annihilated by $a$. Thus, we have a
complete understanding of how $a$ and $a^*$ act on this chain of eigenvectors
for $a^*a$.

Proof. The first result is the definition of $\psi_{n+1}$ and the second follows
from Proposition 11.1 and the fact that $a^*a\psi_0 = 0$. For the third result,
if $n \neq m$, we use the general result that eigenvectors for a self-adjoint
operator (in our case, $a^*a$) with distinct eigenvalues are orthogonal. (This
result actually applies to operators that are only symmetric.)

If $n = m$, we work by induction. For $n = 0$, $\langle\psi_0, \psi_0\rangle = 1$ is assumed. If
we assume $\langle\psi_n, \psi_n\rangle = n!$, we compute that

$$\begin{aligned}
\langle\psi_{n+1}, \psi_{n+1}\rangle &= \langle a^*\psi_n, a^*\psi_n\rangle \\
&= \langle\psi_n, aa^*\psi_n\rangle \\
&= \langle\psi_n, (a^*a + I)\psi_n\rangle \\
&= (n + 1)\langle\psi_n, \psi_n\rangle \\
&= (n + 1)!.
\end{aligned}$$

Finally, we compute that

$$a\psi_{n+1} = aa^*\psi_n = (a^*a + I)\psi_n = (n + 1)\psi_n,$$

which establishes the last claimed result. ■
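The relations in Theorem 11.2 can be illustrated in coordinates. The sketch below is an illustration only, under the assumption that we truncate the chain to its first $N$ vectors $\psi_0, \ldots, \psi_{N-1}$ and record how $a$ and $a^*$ act on coefficient vectors in that basis (which is orthogonal but not orthonormal, so the two matrices are not conjugate transposes of each other). The names `N`, `raise_op`, and `lower_op` are hypothetical. Because of the truncation, the relation $[a, a^*] = I$ necessarily fails in the last diagonal entry.

```python
# Coordinates of a and a* on the truncated chain psi_0, ..., psi_{N-1}.
# Column n of each matrix holds the coefficients of the image of psi_n.
# N and all names below are illustrative assumptions.

N = 6

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# a* psi_n = psi_{n+1}: a pure shift (psi_N is cut off by the truncation).
raise_op = [[1 if i == j + 1 else 0 for j in range(N)] for i in range(N)]

# a psi_{n+1} = (n + 1) psi_n, and a psi_0 = 0.
lower_op = [[j if i == j - 1 else 0 for j in range(N)] for i in range(N)]

number_op = matmul(raise_op, lower_op)        # a* a
comm = [[x - y for x, y in zip(r1, r2)]        # [a, a*] = a a* - a* a
        for r1, r2 in zip(matmul(lower_op, raise_op), number_op)]

print([number_op[n][n] for n in range(N)])     # diagonal of a* a
print([comm[n][n] for n in range(N)])          # 1 except in the last slot
```

The diagonal of `number_op` reproduces the eigenvalues 0, 1, 2, ... of $a^*a$, and the defect in the last entry of `comm` is one way to see that the commutation relation cannot hold exactly on a finite-dimensional space.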

It is now reasonable to ask whether the vectors $\{\psi_n\}_{n=0}^{\infty}$ form an
orthogonal basis for the quantum Hilbert space. Suppose this is not the
case. If we then let $V$ denote the closed span of the $\psi_n$'s, $V$ will be invariant
under both $a$ and $a^*$. Thus, by elementary linear algebra, the orthogonal
complement $V^\perp$ of $V$ will also be invariant under the adjoint operators $a^*$
and $a$, and therefore also under $a^*a$. Therefore, we can begin our analysis
anew in $V^\perp$, with the result that we will obtain a new ground state $\phi_0 \in V^\perp$
(satisfying $a\phi_0 = 0$) that is orthogonal to the original ground state $\psi_0$. If,
then, the closed span of the $\psi_n$'s is not the whole Hilbert space, there will
exist at least two independent solutions of the equation $a\psi = 0$. To put this
claim the other way around, if it turns out that there is only one solution
(up to a constant) of $a\psi = 0$, then we expect that the vectors obtained by
applying $a^*$ repeatedly to the solution will form an orthogonal basis for our
Hilbert space. (Because we are glossing over various technical issues having
to do with the domains of various operators, this conclusion should not be
regarded as completely rigorous.)


11.3  The  Analytic  Approach 

In the preceding section, we analyzed the eigenvectors of the operator $a^*a$
as much as possible using only the commutation relation $[a, a^*] = I$, which
follows from the underlying commutation relation $[X, P] = i\hbar I$. To progress
further, we must recall the actual formula for the operators $a$ and $a^*$.

To simplify our analysis, let us introduce the following natural scale of
distance for our problem:

$$D := \sqrt{\frac{\hbar}{m\omega}}.$$

We then introduce a normalized position variable, measured in units of $D$:

$$\tilde{x} := \frac{x}{D}, \tag{11.9}$$

so that

$$\frac{d}{d\tilde{x}} = \sqrt{\frac{\hbar}{m\omega}}\,\frac{d}{dx}.$$

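A calculation gives the following simple expressions for the raising and
lowering operators:

$$a = \frac{1}{\sqrt{2}}\left(\tilde{x} + \frac{d}{d\tilde{x}}\right), \qquad a^* = \frac{1}{\sqrt{2}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right). \tag{11.10}$$

Note that the constants $m$, $\omega$, and $\hbar$ have conveniently disappeared from
the formulas.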

Given the expression in (11.10), we can easily solve the (first-order, linear)
equation $a\psi_0 = 0$ as

$$\psi_0(x) = Ce^{-\tilde{x}^2/2}. \tag{11.11}$$

If we take $C$ to be positive, then our normalization condition determines
its value to be $(\sqrt{\pi}\,D)^{-1/2}$, by Proposition A.22. (The normalization condition
is that the integral of $|\psi_0|^2$ with respect to $dx$, not $d\tilde{x}$, should be 1.) We
obtain, then,

$$\psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega}{2\hbar}x^2\right\}. \tag{11.12}$$
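The normalization of the ground state can be checked numerically. The sketch below assumes arbitrary illustrative values for $m$, $\omega$, and $\hbar$, takes $C = (m\omega/\pi\hbar)^{1/4}$ (the value forced by requiring the integral of $|\psi_0|^2$ with respect to $dx$ to equal 1), and approximates that integral by a trapezoid rule on a large finite interval. The function names are hypothetical.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text).
m, omega, hbar = 2.0, 3.0, 0.5
D = math.sqrt(hbar / (m * omega))             # natural length scale
C = (m * omega / (math.pi * hbar)) ** 0.25    # constant fixed by normalization

def psi0(x):
    """Ground state C * exp(-x_tilde^2 / 2) with x_tilde = x / D."""
    return C * math.exp(-0.5 * (x / D) ** 2)

def trapezoid(f, lo, hi, steps):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

# Integrate |psi_0|^2 with respect to x (not x_tilde) over [-12 D, 12 D].
norm_sq = trapezoid(lambda x: psi0(x) ** 2, -12 * D, 12 * D, 4000)
print(norm_sq)   # should be very close to 1
```

Integrating with respect to $\tilde{x}$ instead would give $1/D$ rather than 1, which is why the choice of measure matters in the normalization condition.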
It remains only to apply $a^*$ repeatedly to $\psi_0$ to get the “excited states”
$\psi_n$.

Theorem 11.3 The ground state $\psi_0$ of the harmonic oscillator is given
by (11.12). The excited states $\psi_n$ are given by

$$\psi_n(x) = H_n(\tilde{x})\psi_0(x), \tag{11.13}$$

where $H_n$ is a polynomial of degree $n$ given inductively by the formulas

$$H_0(\tilde{x}) = 1,$$

$$H_{n+1}(\tilde{x}) = \frac{1}{\sqrt{2}}\left(2\tilde{x}H_n(\tilde{x}) - \frac{dH_n(\tilde{x})}{d\tilde{x}}\right).$$

Here, $\tilde{x}$ is the normalized position variable given by (11.9).

The polynomials $H_n$ are essentially (modulo various normalization conventions)
the Hermite polynomials.

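Proof. When $n = 0$, (11.13) reduces to $\psi_0 = \psi_0$. Assuming that (11.13)
holds for some $n$, we compute $\psi_{n+1}$ as

$$\begin{aligned}
\psi_{n+1} = a^*\psi_n &= \frac{1}{\sqrt{2}}\left(\tilde{x}H_n(\tilde{x})Ce^{-\tilde{x}^2/2} - \frac{d}{d\tilde{x}}\left[H_n(\tilde{x})Ce^{-\tilde{x}^2/2}\right]\right) \\
&= \frac{1}{\sqrt{2}}\left(2\tilde{x}H_n(\tilde{x}) - \frac{dH_n}{d\tilde{x}}\right)Ce^{-\tilde{x}^2/2} = H_{n+1}(\tilde{x})\psi_0(x),
\end{aligned}$$

as claimed. ■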
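The inductive formula in Theorem 11.3 is easy to run by hand or by machine. The following minimal sketch stores each polynomial as a list of floating-point coefficients, lowest degree first, and applies the recursion $H_{n+1} = (2\tilde{x}H_n - H_n')/\sqrt{2}$ directly. The representation and the function names are choices made here for illustration.

```python
import math

def next_hermite(coeffs):
    """Given the coefficients of H_n (lowest degree first), return those of
    H_{n+1} = (2 x H_n - H_n') / sqrt(2)."""
    two_x_h = [0.0] + [2.0 * c for c in coeffs]                          # 2 x H_n
    deriv = [k * coeffs[k] for k in range(1, len(coeffs))] + [0.0, 0.0]  # H_n', padded
    return [(p - q) / math.sqrt(2.0) for p, q in zip(two_x_h, deriv)]

def hermite_table(n_max):
    """Return the coefficient lists of H_0, ..., H_{n_max}."""
    table = [[1.0]]                       # H_0 = 1
    for _ in range(n_max):
        table.append(next_hermite(table[-1]))
    return table

table = hermite_table(3)
for n, coeffs in enumerate(table):
    print(n, coeffs)
```

With this normalization the first few polynomials are $H_1(\tilde{x}) = \sqrt{2}\,\tilde{x}$, $H_2(\tilde{x}) = 2\tilde{x}^2 - 1$, and $H_3(\tilde{x}) = 2\sqrt{2}\,\tilde{x}^3 - 3\sqrt{2}\,\tilde{x}$.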

Figure 11.1 shows the ground state of the harmonic oscillator, along with
the excited states with $n = 5$ and $n = 30$. Each eigenfunction is plotted as
a function of the normalized position variable $\tilde{x}$. In each case, the shaded
region indicates the extent of the classically allowed region, that is, the
range in which a classical particle with energy $E_n$ can move. Note that
each wave function decays rapidly outside the classically allowed region.
In the last image, we can see that the frequency of oscillation of the wave
function is greatest in the middle of the classically allowed region, while the
amplitude of the wave function is greatest near the ends of the classically
allowed region. Intuitively, these properties of the wave function reflect that
a classical particle with energy $E_n$ has largest momentum in the middle of
the classically allowed region (where the potential is smallest) and that the
classical particle spends more time at the ends of the classically allowed
region, since it is moving slowest there. Further development of this sort of
reasoning may be found in Chap. 15.


11.4  Domain  Conditions  and  Completeness 

Although the analysis in Sect. 11.2 is typical of what is found in physics
texts, it is not completely rigorous from a mathematician’s point of view.
The main problem is that the lowering operator $a$, the raising operator $a^*$,
and the product operator $a^*a$ are all unbounded operators. The difficulty
in working with unbounded operators is that one constantly has to check
that a vector is in the domain of the relevant operator before applying that
operator. For example, suppose we have a vector $\psi_0$ in the domain of $a$ and
satisfying $a\psi_0 = 0$. We wish to apply the raising operator $a^*$ to $\psi_0$ and we
then want to argue that

$$a^*a(a^*\psi_0) = a^*\psi_0.$$

This is easy enough to verify (as we did in the previous section) provided
that all vectors are in the domain of the relevant operators. But how do
we know that $\psi_0$ is in the domain of $a^*$? And even if it is, how do we know
that $a^*\psi_0$ is in the domain of $a^*a$?

FIGURE 11.1. Harmonic oscillator eigenvectors with $n = 0$, $n = 5$, and $n = 30$.
In each case, the classically allowed region is shaded.
These concerns are not just theoretical. Consider a general pair of
operators $A$ and $B$ satisfying $[A, B] = i\hbar I$. If we try to analyze an operator
of the form $\alpha A^2 + \beta B^2$, for $\alpha, \beta > 0$, by the methods of Sect. 11.2,
things can easily go awry, as the counterexample in Sect. 12.2 demonstrates.
Fortunately, in the case of the ordinary position and momentum operators,
the putative eigenfunctions $\psi_n$ for $a^*a$ in Theorem 11.3 are very nice functions,
in the form of a polynomial times a Gaussian. Thus, there is no
difficulty in verifying that these functions are in the domain of any finite
product of creation and annihilation operators. It follows that if $a$ and $a^*$
are given in terms of the usual position and momentum operators and $\psi_0$ is
given by (11.12), the relations in Theorem 11.2 indeed hold.

In particular, we can see that the $\psi_n$'s form an orthogonal set of functions
in $L^2(\mathbb{R})$. Showing that they form an orthogonal basis is also not terribly
difficult.
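The orthogonality relation $\langle\psi_n, \psi_m\rangle = n!\,\delta_{n,m}$ can be tested numerically for the explicit functions of Theorem 11.3. The sketch below works in units where $D = 1$ (an assumption made for convenience, so that $\tilde{x} = x$ and $\psi_0(x) = \pi^{-1/4}e^{-x^2/2}$), builds the first few polynomials from the recursion, and estimates the inner products by a trapezoid rule. All names are hypothetical.

```python
import math

def next_hermite(coeffs):
    """Coefficients (lowest degree first) of H_{n+1} = (2 x H_n - H_n') / sqrt(2)."""
    two_x_h = [0.0] + [2.0 * c for c in coeffs]
    deriv = [k * coeffs[k] for k in range(1, len(coeffs))] + [0.0, 0.0]
    return [(p - q) / math.sqrt(2.0) for p, q in zip(two_x_h, deriv)]

def polyval(coeffs, x):
    """Evaluate a polynomial by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def psi(coeffs, x):
    """H_n(x) * psi_0(x) in units where D = 1."""
    return polyval(coeffs, x) * math.pi ** -0.25 * math.exp(-0.5 * x * x)

def inner(c1, c2, half_width=12.0, steps=4000):
    """Trapezoid estimate of the inner product of two real eigenfunctions."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * psi(c1, x) * psi(c2, x)
    return total * h

polys = [[1.0]]
for _ in range(4):
    polys.append(next_hermite(polys[-1]))

gram = [[inner(p, q) for q in polys] for p in polys]
for row in gram:
    print([round(v, 6) for v in row])
```

Up to quadrature error, the matrix `gram` should be diagonal with entries $0!, 1!, 2!, 3!, 4!$, in agreement with Theorem 11.2.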


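Theorem 11.4 The functions

$$\psi_n(x) = H_n(\tilde{x})\psi_0(x) = H_n\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right)\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega}{2\hbar}x^2\right\}$$

form an orthogonal basis for the Hilbert space $L^2(\mathbb{R})$.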

The  following  result  is  the  key  to  the  proof. 

Lemma 11.5 For all $\alpha \in \mathbb{C}$, the partial sums of the series

$$\sum_{n=0}^{\infty}\frac{\alpha^nx^n}{n!}e^{-x^2/2}$$

converge in $L^2(\mathbb{R})$ to the function $e^{\alpha x}e^{-x^2/2}$.


Proof. We need to show that

$$\int_{\mathbb{R}}\left|e^{\alpha x - x^2/2} - \sum_{n=0}^{N}\frac{\alpha^nx^n}{n!}e^{-x^2/2}\right|^2dx = \int_{\mathbb{R}}\left|\sum_{n=N+1}^{\infty}\frac{\alpha^nx^n}{n!}e^{-x^2/2}\right|^2dx \tag{11.14}$$

tends to zero as $N$ tends to infinity. The integrand on the right-hand side
of (11.14) tends to zero pointwise. If we can find a suitable dominating
function, we can use dominated convergence to conclude that the integral
also tends to zero. We see that

$$\left|\sum_{n=N+1}^{\infty}\frac{\alpha^nx^n}{n!}e^{-x^2/2}\right|^2 \leq \left(\sum_{n=0}^{\infty}\frac{|\alpha x|^n}{n!}\right)^2e^{-x^2} = e^{2|\alpha||x|}e^{-x^2}.$$

Since this last function certainly has finite integral, dominated convergence
applies and we are done. ■
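The convergence asserted in Lemma 11.5 can be observed numerically. The sketch below is an illustration under stated assumptions: it fixes one complex value of $\alpha$ (chosen arbitrarily), replaces the integral over $\mathbb{R}$ by a trapezoid rule on $[-12, 12]$, and evaluates the squared $L^2$ distance between $e^{\alpha x}e^{-x^2/2}$ and the $N$th partial sum for a few values of $N$. The function names are hypothetical.

```python
import cmath
import math

def partial_sum(alpha, x, N):
    """Sum of (alpha x)^n / n! for n = 0, ..., N, built term by term."""
    term, total = 1.0 + 0.0j, 1.0 + 0.0j
    for n in range(1, N + 1):
        term *= alpha * x / n
        total += term
    return total

def l2_error_squared(alpha, N, half_width=12.0, steps=4800):
    """Trapezoid estimate of the left-hand side of (11.14) on a finite interval."""
    h = 2.0 * half_width / steps
    acc = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        diff = (cmath.exp(alpha * x) - partial_sum(alpha, x, N)) * math.exp(-0.5 * x * x)
        weight = 0.5 if k in (0, steps) else 1.0
        acc += weight * abs(diff) ** 2
    return acc * h

alpha = 1.0 + 2.0j   # an arbitrary complex parameter
errors = {N: l2_error_squared(alpha, N) for N in (5, 20, 40)}
for N, err in errors.items():
    print(N, err)
```

The errors decrease rapidly with $N$, as the dominated convergence argument predicts, even though for fixed $N$ the partial sums are poor approximations at large $|x|$; the Gaussian factor suppresses that region.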

Proof of Theorem 11.4. It is easily seen that the raising and lowering
operators map the Schwartz space $\mathcal{S}(\mathbb{R})$ (Definition A.15) into itself.
Furthermore, it is easy to verify (Exercise 1) that

$$\langle\phi, a\psi\rangle = \langle a^*\phi, \psi\rangle$$

for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$, and so also

$$\langle\phi, a^*a\psi\rangle = \langle a^*a\phi, \psi\rangle.$$

It is evident that both the ground state $\psi_0$ and all the excited states $\psi_n$
occurring in Theorem 11.4 belong to $\mathcal{S}(\mathbb{R})$. Thus, the proof of Theorem 11.2
is indeed valid. We conclude, then, that the $\psi_n$'s form an orthogonal set of
vectors in $L^2(\mathbb{R})$ and that they are eigenvectors for $\hat{H}$ with the indicated
eigenvalues.

It remains to show that the $\psi_n$'s form an orthogonal basis for $L^2(\mathbb{R})$. Let
$V$ denote the space of finite linear combinations of the $\psi_n$'s. Since $H_n$ is a
polynomial of degree $n$, it is easily seen that $V$ consists precisely of functions
of the form

$$\psi(x) = p(\tilde{x})e^{-\tilde{x}^2/2},$$

where $p$ is a polynomial.

Lemma 11.5 then shows that $e^{ik\tilde{x}}e^{-\tilde{x}^2/2}$ belongs to the $L^2$-closure of $V$
for all $k \in \mathbb{R}$. Thus, if $\psi$ is orthogonal to every element of $V$, we have

$$\int_{\mathbb{R}}e^{-ik\tilde{x}}e^{-\tilde{x}^2/2}\psi(x)\,dx = 0 \tag{11.15}$$

for all $k$. Now, since $e^{-\tilde{x}^2/2}$ belongs to $L^\infty(\mathbb{R}) \cap L^2(\mathbb{R})$ and $\psi$ belongs to
$L^2(\mathbb{R})$, their product belongs to $L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Thus, (11.15) tells us that
the $L^2$ Fourier transform of $e^{-\tilde{x}^2/2}\psi(x)$ is identically zero. Thus, $e^{-\tilde{x}^2/2}\psi(x)$
must be the zero element of $L^2(\mathbb{R})$, by the Plancherel theorem, and so
$\psi(x) = 0$ almost everywhere. This shows that $V^\perp = \{0\}$, meaning that $V$
is dense in $L^2(\mathbb{R})$. ■


11.5  Exercises 

1. Show that for any Schwartz functions $\phi$ and $\psi$, we have

$$\langle\phi, a\psi\rangle = \langle a^*\phi, \psi\rangle,$$

as expected.

Hint: Use integration by parts on the interval $[-A, A]$ and show that
the boundary terms tend to zero as $A$ tends to infinity.
2. Show that the polynomials $H_n$ satisfy the following relations:

$$H_{n-1}(y) = \frac{1}{n\sqrt{2}}H_n'(y)$$

and

$$H_{n+1}(y) = \frac{1}{\sqrt{2}}\left(2yH_n(y) - n\sqrt{2}\,H_{n-1}(y)\right).$$

Hint: Start with the relation $a\psi_n = n\psi_{n-1}$.

3. Establish the following Rodrigues formula for the polynomials $H_n$:

$$H_n(y) = (-1)^n2^{-n/2}e^{y^2}\frac{d^n}{dy^n}e^{-y^2}.$$


4. In this exercise, we prove the following claim: The polynomial $H_n$ has
$n$ distinct real zeros and the zeros of $H_n$ “interlace” with the zeros of
$H_{n-1}$, meaning that there is exactly one zero of $H_{n-1}$ between each
pair of consecutive zeros of $H_n$.

(a) Verify the claim for $H_1$ and $H_0$.

(b) Assume, inductively, that $H_n$ and $H_{n-1}$ have distinct real zeros
and that the zeros interlace. Show that $H_{n-1}$ alternates in sign
at consecutive zeros of $H_n$. Then show that $H_{n+1}$ and $H_{n-1}$ have
opposite signs at each zero of $H_n$, so that $H_{n+1}$ also alternates
in sign at consecutive zeros of $H_n$. Conclude that $H_{n+1}$ must
have at least one zero between each pair of consecutive zeros
of $H_n$.

Hint: Use Exercise 2.

(c) Show that $H_{n+1}$ and $H_{n-1}$ have the same sign near $\pm\infty$ but
opposite signs at the largest and smallest zeros of $H_n$. Conclude
that $H_{n+1}$ has at least one zero below the smallest zero of $H_n$
and at least one zero above the largest zero of $H_n$.

(d) Conclude that $H_{n+1}$ has $n + 1$ real zeros that interlace with the
zeros of $H_n$.


5. Let $\psi_n/\|\psi_n\|$ be the normalized $n$th excited state.

(a) Let $\tilde{X} = X/D$, where $D = (\hbar/m\omega)^{1/2}$. Show that


Hint: Express $\tilde{X}$ in terms of $a$ and $a^*$, using (11.10), and then
use Theorem 11.2.

(b) Show that

$$\left(\frac{\hbar(n + 1/2)}{m\omega}\right)^{1/2}$$

If $T$ and $V$ denote the kinetic energy and potential energy terms,
respectively, in (11.3), show that

The  Uncertainty  Principle 


In this chapter, we will continue our investigation of the consequences of
the commutation relations among the position and momentum operators.
We will mostly consider a particle in $\mathbb{R}^1$, where we have

$$[X, P] = i\hbar I. \tag{12.1}$$

We have already seen that much of the analysis of the Hamiltonian $\hat{H}$
for the quantum harmonic oscillator (given by $c_1P^2 + c_2X^2$) can be carried
out using only the commutation relation (12.1). There are two other
main results that can be derived from these commutation relations: the
Heisenberg uncertainty principle and the Stone-von Neumann theorem.
The uncertainty principle states that the product of the uncertainty in $X$
and the uncertainty in $P$ cannot be smaller than $\hbar/2$. The Stone-von Neumann
theorem, meanwhile, states that any two self-adjoint operators $A$
and $B$ satisfying $[A, B] = i\hbar I$ “look like” several copies of the standard
position and momentum operators acting on $L^2(\mathbb{R})$. Both results are true
only under certain technical domain conditions, which we will need to examine
carefully. We discuss the uncertainty principle in this chapter and
the Stone-von Neumann theorem in the next chapter.

The uncertainty principle states that for all $\psi$ in $L^2(\mathbb{R})$ satisfying certain
domain conditions, we have

$$(\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2},$$

where, for any observable $A$, we let $\Delta_\psi A$ denote the “uncertainty” in measurements
of $A$ in the state $\psi$ (Definition 3.13). This means that one cannot
make both the uncertainty in position and the uncertainty in momentum
arbitrarily small in the same state $\psi$.

Although we can easily make $\Delta_\psi X$ as small as we want simply by taking
$\psi$ to be supported in a small interval, if we do that, $\Delta_\psi P$ will be large.
Similarly, we can make $\Delta_\psi P$ as small as we like, by taking the momentum
wave function $\tilde{\psi}(p)$ (Sect. 6.6) to be supported in a small interval, but
then $\Delta_\psi X$ will get large. In the idealized limit in which the position wave
function is concentrated at a single point, $\psi(x)$ would be a multiple of
$\delta(x - a)$ for some $a$, in which case, the momentum wave function $\tilde{\psi}(p)$
would be a multiple of $e^{-ipa/\hbar}$. In that case, $|\tilde{\psi}(p)|^2$ is constant, meaning
that the momentum wave function is completely spread out over the whole
real line.

This uncertainty principle may be interpreted as saying that it is impossible
to simultaneously measure the position and momentum of a quantum
particle. After all, we have said (Axiom 4) that if we perform a measurement
of an observable $A$ with a discrete spectrum, then immediately after
the measurement the state $\psi$ of the system should be an eigenvector for $A$.
If $A$ has a continuous spectrum, this principle is replaced by the requirement
that after the measurement, the uncertainty in $A$ should be very small.
If we could measure both the position and the momentum of the particle
simultaneously with arbitrary precision, then after the measurement,
both $\Delta X$ and $\Delta P$ would have to be very small, violating the uncertainty
principle.

Now, on the scale of everyday life, Planck’s constant is very small. If,
for example, we measure mass in units of grams, distance in units of centimeters,
and time in units of seconds, then $\hbar$ has the numerical value of
$1.054 \times 10^{-27}$. Thus, on “macroscopic” scales of energy and momentum, it
is possible for the uncertainties in position and momentum both to be very
small. But on the atomic scale, the uncertainty principle puts a substantial
limitation on how localized the position and momentum of a particle
can be.
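To make the contrast between scales concrete, the following sketch evaluates the bound $\Delta v \geq \hbar/(2m\,\Delta x)$ in the same gram-centimeter-second units. It assumes $\Delta P = m\,\Delta v$, and the two test cases (a one-gram object localized to $10^{-6}$ cm, and an electron of mass $9.109 \times 10^{-28}$ g localized to $10^{-8}$ cm) are illustrative choices, not examples taken from the text.

```python
# Lower bound on the velocity uncertainty, dv >= hbar / (2 m dx), in CGS units.
# The masses and localization lengths are illustrative assumptions.

HBAR = 1.054e-27  # g cm^2 / s

def min_velocity_uncertainty(mass_g, dx_cm):
    """Smallest velocity spread (cm/s) compatible with dx * dp >= hbar / 2."""
    return HBAR / (2.0 * mass_g * dx_cm)

gram_scale = min_velocity_uncertainty(1.0, 1.0e-6)          # 1 g object, 10^-6 cm
electron = min_velocity_uncertainty(9.109e-28, 1.0e-8)       # electron, 10^-8 cm

print(gram_scale)   # roughly 5e-22 cm/s
print(electron)     # roughly 6e7 cm/s
```

For the one-gram object the bound is far below anything measurable, while for the electron it is a substantial velocity.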

In Sect. 12.1, we prove a version of the uncertainty principle for any two
operators $A$ and $B$ satisfying $[A, B] = i\hbar I$, under a seemingly innocuous
assumption on the domains of the operators involved. In Sect. 12.2, however,
we see that the domain assumptions are not so innocuous after all.
In that section, we encounter two operators satisfying $[A, B] = i\hbar I$ on a
dense subspace of the Hilbert space, along with a vector $\psi$ such that the
uncertainty in $A$ is finite and the uncertainty in $B$ is zero. The existence
of such a vector is surely contrary to the spirit of the uncertainty principle,
even though it does not violate the version of the uncertainty principle
proved in Sect. 12.1. (The vector $\psi$ in Sect. 12.2 does not satisfy the domain
assumptions of Theorem 12.4.) Finally, in Sect. 12.3, we show that for the
usual position and momentum operators on $L^2(\mathbb{R})$, no such counterexamples
occur: If $\Delta_\psi X$ and $\Delta_\psi P$ are both defined, then $(\Delta_\psi X)(\Delta_\psi P) \geq \hbar/2$.
In this section, it is essential that we make sure that all vectors are in
the domains of the various operators we want to apply to these vectors.
With this concern in mind, we make the following definition. (Compare
Definition 9.36.)

Definition 12.1 If $A$ and $B$ are unbounded operators on $\mathbf{H}$, define $AB$ to
be the operator with domain

$$\mathrm{Dom}(AB) = \{\psi \in \mathrm{Dom}(B) \mid B\psi \in \mathrm{Dom}(A)\}$$

and given by $(AB)\psi = A(B\psi)$.

Even if $\mathrm{Dom}(A)$ and $\mathrm{Dom}(B)$ are dense in $\mathbf{H}$, it could happen that
$\mathrm{Dom}(AB)$ is not dense in $\mathbf{H}$.

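Recall (Definition 3.13) that the uncertainty of a symmetric operator $A$
in a state $\psi$ is defined to be

$$(\Delta_\psi A)^2 = \left\langle(A - \langle A\rangle_\psi I)^2\right\rangle_\psi. \tag{12.2}$$

As written, this definition requires that $\psi$ belong to the domain of
$(A - \langle A\rangle_\psi I)^2$, which is the same as the domain of $A^2$. However, since we assume
that $A$ is symmetric, then $\langle A\rangle_\psi = \langle\psi, A\psi\rangle$ is real, so that $A - \langle A\rangle_\psi I$ is
again symmetric. Thus, (12.2) can be rewritten as

$$(\Delta_\psi A)^2 = \left\langle(A - \langle A\rangle_\psi I)\psi, (A - \langle A\rangle_\psi I)\psi\right\rangle.$$

Having written the uncertainty in this way, it is natural to extend the
definition of uncertainty to vectors that belong only to $\mathrm{Dom}(A)$, as follows.

Definition 12.2 If $A$ is a symmetric operator on $\mathbf{H}$, then for all unit
vectors $\psi$ in $\mathrm{Dom}(A)$, the uncertainty $\Delta_\psi A$ of $A$ in the state $\psi$ is given
by

$$(\Delta_\psi A)^2 = \left\langle(A - \langle A\rangle_\psi I)\psi, (A - \langle A\rangle_\psi I)\psi\right\rangle. \tag{12.3}$$

By expanding out the right-hand side of (12.3), we see that the uncertainty
may also be computed as

$$(\Delta_\psi A)^2 = \langle A\psi, A\psi\rangle - (\langle\psi, A\psi\rangle)^2.$$

[Compare (3.24).] Of course, if $\psi$ happens to be in the domain of $A^2$, then
Definition 12.2 agrees with (12.2).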

Proposition 12.3 If $A$ is a symmetric operator on $\mathbf{H}$, then for all unit
vectors $\psi \in \mathrm{Dom}(A)$, we have $\Delta_\psi A = 0$ if and only if $\psi$ is an eigenvector
for $A$.

Proof. If $\Delta_\psi A = 0$, then from (12.3), we see that $(A - \langle A\rangle_\psi I)\psi = 0$,
meaning that $\psi$ is an eigenvector for $A$ with eigenvalue $\langle A\rangle_\psi$. Conversely, if
$A\psi = \lambda\psi$ for some $\lambda$, then $\langle\psi, A\psi\rangle = \lambda\langle\psi, \psi\rangle = \lambda$. Thus, $(A - \langle A\rangle_\psi I)\psi = 0$,
which, by (12.3), means that $\Delta_\psi A = 0$. ■
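In finite dimensions, where no domain questions arise, Definition 12.2 and Proposition 12.3 can be checked directly. The sketch below uses a real symmetric 2×2 matrix as a stand-in for $A$ (an assumption made purely for illustration) and computes $(\Delta_\psi A)^2$ both from (12.3) and from the expanded formula, for one generic unit vector and for one eigenvector. All names are hypothetical.

```python
import math

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def inner(u, v):
    """Inner product, conjugate-linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def uncertainty_sq(M, psi):
    """(Delta_psi M)^2 = <(M - <M>)psi, (M - <M>)psi> for a unit vector psi."""
    Mpsi = apply(M, psi)
    mean = inner(psi, Mpsi).real
    shifted = [a - mean * b for a, b in zip(Mpsi, psi)]
    return inner(shifted, shifted).real

def uncertainty_sq_expanded(M, psi):
    """The same quantity computed as <M psi, M psi> - <psi, M psi>^2."""
    Mpsi = apply(M, psi)
    return inner(Mpsi, Mpsi).real - inner(psi, Mpsi).real ** 2

A = [[1.0, 2.0], [2.0, -2.0]]                     # symmetric, eigenvalues 2 and -3
generic = [0.6, 0.8j]                             # unit vector, not an eigenvector
eigvec = [2 / math.sqrt(5), 1 / math.sqrt(5)]     # eigenvector for eigenvalue 2

print(uncertainty_sq(A, generic), uncertainty_sq_expanded(A, generic))
print(uncertainty_sq(A, eigvec))                  # zero up to rounding
```

The two formulas agree on the generic vector, and the uncertainty vanishes on the eigenvector, as Proposition 12.3 requires.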

As discussed in the introduction to this chapter, we expect that immediately
after a measurement of an observable A, the state of the system
will  have  very  small  uncertainty  for  A.  Indeed,  if  A  has  discrete  spectrum, 
we  expect  that  the  state  of  the  system  will  be  an  eigenvector  for  A.  Even 
in  the  case  of  a  continuous  spectrum,  we  expect  that  the  uncertainty  in 
A  can  be  made  as  small  as  one  wishes,  by  making  more  and  more  precise 
measurements.  Suppose  now  that  one  wishes  to  observe  simultaneously  two 
(or  more)  different  observables,  represented  by  operators  A  and  B.  In  the 
case  of  a  discrete  spectrum,  the  system  after  the  measurement  should  be 
simultaneously  an  eigenvector  for  A  and  an  eigenvector  for  B.  In  the  case 
where  A  and  B  commute,  this  idea  is  reasonable.  There  is  a  version  of 
the  spectral  theorem  for  commuting  self-adjoint  operators;  in  the  case  of 
discrete  spectrum,  it  says  that  two  commuting  self-adjoint  operators  have 
an  orthonormal  basis  of  simultaneous  eigenvectors  with  real  eigenvalues. 
(In the case of unbounded operators, there are, as usual, technical domain
conditions in defining what it means for two self-adjoint operators to commute.)

In  the  case  where  A  and  B  do  not  commute,  they  do  not  need  to  have  any 
simultaneous  eigenvectors.  Certainly,  A  and  B  cannot  have  an  orthonormal 
basis  of  simultaneous  eigenvectors,  or  they  would  in  fact  commute.  The  lack 
of  simultaneous  eigenvectors  suggests,  then,  that  it  is  simply  not  possible 
to  make  a  simultaneous  measurement  of  two  self-adjoint  operators  unless 
they  commute.  In  standard  physics  terminology,  the  quantities  A  and  B 
are  said  to  be  “incommensurable,”  meaning  not  capable  of  being  measured 
at  the  same  time.  (See  Exercise  2  for  a  classification  of  the  simultaneous 
eigenvectors  of  a  representative  pair  of  noncommuting  operators.) 

In  the  case  of  a  continuous  spectrum,  the  notion  of  an  eigenvector  is 
replaced  by  the  notion  of  a  state  with  very  small  uncertainty  for  the  relevant 
operator.  In  light  of  our  discussion  of  simultaneous  eigenvectors,  we  may 
expect  that  for  noncommuting  operators,  it  may  be  difficult  to  find  states 
where  the  uncertainties  of  both  operators  are  small.  This  expectation  is 
realized  in  the  following  version  of  the  uncertainty  principle. 

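Theorem 12.4 Suppose $A$ and $B$ are symmetric operators and $\psi$ is a unit
vector belonging to $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then

$$(\Delta_\psi A)^2(\Delta_\psi B)^2 \geq \frac{1}{4}\left|\langle[A, B]\rangle_\psi\right|^2. \tag{12.4}$$

Note that if $\psi \in \mathrm{Dom}(AB)$ then in particular, $\psi \in \mathrm{Dom}(B)$, and if
$\psi \in \mathrm{Dom}(BA)$ then $\psi \in \mathrm{Dom}(A)$. Thus, the assumptions on $\psi$ are sufficient
to guarantee that $\Delta_\psi A$ and $\Delta_\psi B$ make sense as in Definition 12.2.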
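Proof. Define operators $A'$ and $B'$ by $A' := A - \langle\psi, A\psi\rangle I$ and $B' :=
B - \langle\psi, B\psi\rangle I$. (We use the same domains for $A'$ and $B'$ as for $A$ and
$B$, and it is easily verified that $A'$ and $B'$ are still symmetric on those
domains.) Then by the Cauchy-Schwarz inequality, we obtain

$$\begin{aligned}
\langle A'\psi, A'\psi\rangle\langle B'\psi, B'\psi\rangle &\geq \left|\langle A'\psi, B'\psi\rangle\right|^2 &\quad (12.5)\\
&\geq \left|\mathrm{Im}\,\langle A'\psi, B'\psi\rangle\right|^2 &\quad (12.6)\\
&= \frac{1}{4}\left|\langle A'\psi, B'\psi\rangle - \langle B'\psi, A'\psi\rangle\right|^2. &\quad (12.7)
\end{aligned}$$

The assumptions on $\psi$ guarantee that $B\psi \in \mathrm{Dom}(A)$ and hence also that
$B'\psi \in \mathrm{Dom}(A')$, and similarly with $A'$ and $B'$ reversed. Since $A'$ and $B'$
are symmetric, we may rewrite (12.7) as

$$\langle A'\psi, A'\psi\rangle\langle B'\psi, B'\psi\rangle \geq \frac{1}{4}\left|\langle\psi, A'B'\psi\rangle - \langle\psi, B'A'\psi\rangle\right|^2 = \frac{1}{4}\left|\langle\psi, [A', B']\psi\rangle\right|^2.$$

Now, since the identity operator commutes with everything, the commutator
of $A'$ and $B'$ is the same as the commutator of $A$ and $B$. Furthermore,
$\langle A'\psi, A'\psi\rangle$ is nothing but $(\Delta_\psi A)^2$ and similarly for $B$. Thus, we obtain

$$(\Delta_\psi A)^2(\Delta_\psi B)^2 \geq \frac{1}{4}\left|\langle\psi, [A, B]\psi\rangle\right|^2,$$

which is what we wanted to prove. ■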
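Inequality (12.4) can be tested in a setting with no domain issues at all. The sketch below takes $A$ and $B$ to be the Pauli matrices $\sigma_x$ and $\sigma_y$ (an illustrative assumption; these bounded operators are not the operators studied in this chapter) and scans a grid of unit vectors in $\mathbb{C}^2$, recording the smallest value of the left-hand side minus the right-hand side. All names are hypothetical.

```python
import cmath
import math

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def inner(u, v):
    """Inner product, conjugate-linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def variance(M, psi):
    """(Delta_psi M)^2 as in Definition 12.2, for a unit vector psi."""
    Mpsi = apply(M, psi)
    mean = inner(psi, Mpsi).real
    shifted = [a - mean * b for a, b in zip(Mpsi, psi)]
    return inner(shifted, shifted).real

def commutator_expectation(A, B, psi):
    """<psi, [A, B] psi>, written as <A psi, B psi> - <B psi, A psi>
    (valid because A and B are Hermitian)."""
    Apsi, Bpsi = apply(A, psi), apply(B, psi)
    return inner(Apsi, Bpsi) - inner(Bpsi, Apsi)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

worst_gap = float("inf")
for i in range(13):
    for j in range(24):
        theta, phi = math.pi * i / 12, 2 * math.pi * j / 24
        psi = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]
        lhs = variance(sigma_x, psi) * variance(sigma_y, psi)
        rhs = 0.25 * abs(commutator_expectation(sigma_x, sigma_y, psi)) ** 2
        worst_gap = min(worst_gap, lhs - rhs)

print(worst_gap)   # non-negative up to rounding; a zero gap means equality
```

The gap is never negative beyond rounding error, and it vanishes for some of the sampled states, so the inequality cannot be improved in general.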

We now specialize Theorem 12.4 to the case in which the commutator is
$i\hbar I$ and take the square root of both sides.

Corollary 12.5 Suppose $A$ and $B$ are symmetric operators satisfying

$$[A, B] = i\hbar I$$

on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then if $\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is a unit
vector, we have

$$(\Delta_\psi A)(\Delta_\psi B) \geq \frac{\hbar}{2}. \tag{12.8}$$

In particular, for all unit vectors $\psi \in L^2(\mathbb{R})$ in $\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we
have

$$(\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2}. \tag{12.9}$$

Note that the factor of $\hbar$ appearing on the right-hand side of (12.8) is really
just $|\langle\psi, [A, B]\psi\rangle|$. Since, however, $\psi$ is a unit vector and $[A, B] = i\hbar I$,
$\psi$ drops out of the right-hand side of our inequality. We see then that both
sides of (12.9) make sense whenever $\Delta_\psi X$ and $\Delta_\psi P$ make sense, namely,
whenever $\psi$ belongs to $\mathrm{Dom}(X)$ and to $\mathrm{Dom}(P)$. (Recall Definition 12.2.)
On the other hand, the proof that we have given for (12.9) requires $\psi$ to
be in both $\mathrm{Dom}(XP)$ and $\mathrm{Dom}(PX)$. Nevertheless, it is natural to ask
whether (12.9) holds for all $\psi$ in $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. We may similarly
ask whether (12.8) holds for all $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. As we will see in
Sects. 12.2 and 12.3, the answer to the first question is yes and the answer
to the second question is no.

Meanwhile, it is of interest to investigate “minimum uncertainty states,”
that is, states $\psi$ for which the inequality (12.4) is an equality.

Proposition 12.6 If $A$ and $B$ are symmetric and $\psi$ is a unit vector in
$\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, equality holds in (12.4) if and only if one of the
following holds: (1) $\psi$ is an eigenvector for $A$, (2) $\psi$ is an eigenvector for
$B$, or (3) $\psi$ is an eigenvector for an operator of the form

$$A - i\gamma B$$

for some nonzero real number $\gamma$.

In the case $A = X$ and $B = P$, we will consider examples where equality
holds in Sect. 12.4.

Proof. To get equality in (12.4), we must have equality in both (12.5)
and (12.6). Equality in (12.5) occurs if and only if $A'\psi = 0$ or $B'\psi = 0$ or
$A'\psi = cB'\psi$ for some nonzero constant $c$. If $A'\psi$ is zero, $\psi$ is an eigenvector
for $A$ with eigenvalue $\langle A\rangle_\psi$. In that case, equality holds in (12.6) as well.
Conversely, if $\psi$ is an eigenvector for $A$ with some eigenvalue $\lambda$, then $\langle A\rangle_\psi =
\lambda$ and $A'\psi = 0$. Similarly, $B'\psi = 0$ if and only if $\psi$ is an eigenvector for $B$.

Meanwhile, suppose $A'\psi$ and $B'\psi$ are nonzero and $A'\psi = cB'\psi$, so that
equality holds in (12.5). Then equality holds in (12.6) if and only if $c = i\gamma$ for
some nonzero $\gamma \in \mathbb{R}$. Thus, when $A'\psi$ and $B'\psi$ are nonzero, we get equality
in (12.4) if and only if

$$A'\psi = i\gamma B'\psi \tag{12.10}$$

for some nonzero real number $\gamma$. Recalling the definition of $A'$ and $B'$,
(12.10) says that

$$(A - \langle\psi, A\psi\rangle I)\psi = i\gamma(B - \langle\psi, B\psi\rangle I)\psi \tag{12.11}$$

or

$$(A - i\gamma B)\psi = \lambda\psi, \tag{12.12}$$

where $\lambda = \langle\psi, A\psi\rangle - i\gamma\langle\psi, B\psi\rangle$.

Thus, if (12.11) holds, $\psi$ is an eigenvector of $A - i\gamma B$. Conversely, if $\psi$
is an eigenvector for $A - i\gamma B$ with some eigenvalue $\lambda = c + id$ in $\mathbb{C}$, then

$$(c + id)\|\psi\|^2 = \langle\psi, (A - i\gamma B)\psi\rangle = \langle\psi, A\psi\rangle - i\gamma\langle\psi, B\psi\rangle. \tag{12.13}$$

Since $A$ and $B$ are assumed to be symmetric and $\psi$ is a unit vector, we
may equate real and imaginary parts in (12.13) to obtain

$$c = \langle\psi, A\psi\rangle; \qquad d = -\gamma\langle\psi, B\psi\rangle.$$

From this we can see that (12.11) and (12.10) hold, and thus equality holds
in (12.4). ■
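Case (3) of Proposition 12.6 can be illustrated with Gaussians. For a width $s > 0$, the function $\psi_s(x) = (\pi s^2)^{-1/4}e^{-x^2/(2s^2)}$ satisfies $(X + i(s^2/\hbar)P)\psi_s = 0$, so it is an eigenvector of $A - i\gamma B$ with $A = X$, $B = P$, and $\gamma = -s^2/\hbar$. The sketch below checks numerically that $(\Delta X)(\Delta P) = \hbar/2$ for several widths. It assumes units with $\hbar = 1$, uses the exact derivative of the Gaussian, and approximates the integrals by a trapezoid rule; the widths and names are illustrative.

```python
import math

HBAR = 1.0   # units with hbar = 1 (an assumption made for convenience)

def trapezoid(f, lo, hi, steps):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

def gaussian(s):
    """Unit-norm Gaussian of width s together with its derivative."""
    c = (math.pi * s * s) ** -0.25
    psi = lambda x: c * math.exp(-x * x / (2 * s * s))
    dpsi = lambda x: -x / (s * s) * psi(x)
    return psi, dpsi

def uncertainty_product(s):
    psi, dpsi = gaussian(s)
    L = 12.0 * s
    # <X> = <P> = 0 for a real, even wave function, so the variances reduce
    # to <X psi, X psi> and <P psi, P psi> = hbar^2 * integral of (psi')^2.
    var_x = trapezoid(lambda x: (x * psi(x)) ** 2, -L, L, 4000)
    var_p = HBAR ** 2 * trapezoid(lambda x: dpsi(x) ** 2, -L, L, 4000)
    return math.sqrt(var_x * var_p)

products = [uncertainty_product(s) for s in (0.5, 1.0, 3.0)]
print(products)   # each entry close to HBAR / 2
```

Changing $s$ trades position uncertainty against momentum uncertainty while the product stays at the minimum value.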


12.2  A  Counterexample 

In this section, we consider the Hilbert space $L^2[-1, 1]$. As our “position”
operator, we use the usual formula,

$$A\psi(x) = x\psi(x).$$

Note that $A$ is a bounded operator, because we restrict $x$ to the bounded
interval $[-1, 1]$. As such, $A$ is defined (and self-adjoint) on the whole Hilbert
space $L^2[-1, 1]$. As our “momentum” operator, we again use the usual formula,

$$B = -i\hbar\frac{d}{dx}.$$

As the domain of $B$ we will take the space of continuously differentiable
functions $\psi$ on $[-1, 1]$ satisfying the periodic boundary condition,

$$\psi(-1) = \psi(1). \tag{12.14}$$

To verify that $B$ is symmetric, note that for any $C^1$ functions $\phi$ and $\psi$,
we have

$$\int_{-1}^{1}\overline{\phi(x)}\,\frac{d\psi}{dx}\,dx = \overline{\phi(1)}\psi(1) - \overline{\phi(-1)}\psi(-1) - \int_{-1}^{1}\overline{\frac{d\phi}{dx}}\,\psi(x)\,dx.$$

If both $\phi$ and $\psi$ satisfy the periodic boundary condition (12.14), the boundary
terms cancel. This shows that the operator $d/dx$ is skew-symmetric
on $\mathrm{Dom}(B)$, from which it follows that $-i\hbar\,d/dx$ is symmetric
on $\mathrm{Dom}(B)$. Actually, since the functions

$$\psi_n(x) := \frac{1}{\sqrt{2}}e^{\pi inx}, \quad n \in \mathbb{Z}, \tag{12.15}$$

constitute an orthonormal basis of eigenvectors for $B$ with real eigenvalues,
$B$ is essentially self-adjoint, by Example 9.25.

Now, for all $\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ we have, by direct calculation,

$$AB\psi - BA\psi = i\hbar\psi, \tag{12.16}$$

just as for the usual position and momentum operators. Furthermore,
$\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is dense in $\mathbf{H}$, since it contains all continuously
differentiable functions $\psi$ such that $\psi(-1) = \psi(1) = 0$. Consider, now, the
function $\psi_n(x)$ in (12.15), for some integer $n$. Clearly, $\psi_n$ is in the domain
of $B$, since $B\psi_n$ is just a multiple of $\psi_n$. Since $\psi_n$ is an eigenvector for $B$,
the uncertainty of $B$ in the state $\psi_n$ is zero! Meanwhile, since $A$ is bounded,
the uncertainty of $A$ is well defined and finite. Thus, $\Delta_{\psi_n}A$ and $\Delta_{\psi_n}B$ are
both unambiguously defined and

$$(\Delta_{\psi_n}A)(\Delta_{\psi_n}B) = 0. \tag{12.17}$$

How can (12.17) hold? Is it not, in light of (12.16), a violation of (12.8)
in Corollary 12.5? The answer is no, for the reason that $\psi_n$ does not satisfy
the domain assumptions in that corollary. Specifically, $A\psi_n$ is not in the
domain of $B$, since $A\psi_n$ does not satisfy the periodic boundary condition
in the definition of $\mathrm{Dom}(B)$. Thus, $\psi_n$ does not belong to $\mathrm{Dom}(BA)$.

Although it does not contradict Corollary 12.5, (12.17) certainly violates
the spirit of the uncertainty principle. In the next section, we will show
that no such strange counterexamples occur for the usual position and
momentum operators.


12.3 Uncertainty Principle, Second Version

In this section, we will see that if $A$ and $B$ are taken to be the usual
position and momentum operators $X$ and $P$, the uncertainty principle holds
whenever $\Delta_\psi X$ and $\Delta_\psi P$ are defined. We continue to use Definition 12.2
for the definition of the uncertainty in any operator, in which case, for
$\Delta_\psi X$ and $\Delta_\psi P$ to be defined, we require only that $\psi$ belong to $\mathrm{Dom}(X)$
and $\mathrm{Dom}(P)$.

We are now ready to formulate the strong version of the uncertainty
principle.

Theorem 12.7 Suppose $\psi$ is a unit vector in $L^2(\mathbb{R})$ belonging to
$\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. Then

$$(\Delta_\psi X)(\Delta_\psi P) \ge \frac{\hbar}{2}, \tag{12.18}$$

where $\Delta_\psi X$ and $\Delta_\psi P$ are given by Definition 12.2.

Proof. According to Stone's theorem and Example 10.16, the operator $P$
is $\hbar$ times the infinitesimal generator of the group $U(\cdot)$ of translations. That
is to say, for all $\psi \in \mathrm{Dom}(P)$, we have

$$(P\psi)(x) = -i\hbar\lim_{a\to 0}\frac{\psi(x+a) - \psi(x)}{a},$$

where the limit is in the $L^2$ norm sense. Thus,

$$\begin{aligned}
\langle X\psi, P\psi\rangle
&= \lim_{a\to 0}\left\langle X\psi,\; -i\hbar\,\frac{\psi(x+a) - \psi(x)}{a}\right\rangle \\
&= \lim_{a\to 0}\left(\frac{1}{a}\langle x\psi(x), -i\hbar\psi(x+a)\rangle + \frac{i\hbar}{a}\langle X\psi, \psi\rangle\right) \\
&= \lim_{a\to 0}\left(\frac{1}{a}\langle i\hbar(y-a)\psi(y-a), \psi(y)\rangle + \frac{i\hbar}{a}\langle X\psi, \psi\rangle\right),
\end{aligned}$$

where in the last step we have made the change of variable $y = x + a$.
If we rename the variable of integration back to $x$, we get

$$\begin{aligned}
\langle X\psi, P\psi\rangle
&= \lim_{a\to 0}\left[\left\langle i\hbar\,x\,\frac{\psi(x-a) - \psi(x)}{a},\; \psi(x)\right\rangle + i\hbar\langle\psi(x-a), \psi(x)\rangle\right] \\
&= \lim_{a\to 0}\left[\left\langle i\hbar\,\frac{\psi(x-a) - \psi(x)}{a},\; X\psi\right\rangle + i\hbar\langle\psi(x-a), \psi(x)\rangle\right] \\
&= \langle P\psi, X\psi\rangle + i\hbar\langle\psi, \psi\rangle.
\end{aligned} \tag{12.19}$$

In the second equality, we have used that $X$ is symmetric and that (check)
if $\psi \in \mathrm{Dom}(X)$, then $\psi(x-a) \in \mathrm{Dom}(X)$ for each fixed $a$. In the last
equality, we get a minus sign from having $\psi(x-a) - \psi(x)$ rather than
$\psi(x+a) - \psi(x)$, and we use that translation is strongly continuous.

It should be noted that (12.19) is precisely what we would get by formally
moving $X$ to the right-hand side of the inner product, using the commutation
relation $XP - PX = i\hbar I$, and then moving $P$ to the left-hand side of
the inner product. But to make that calculation rigorous, we would need to
assume that $\psi$ is in the domain of $XP$ and the domain of $PX$. In (12.19),
on the other hand, we have obtained the desired conclusion assuming only
that $\psi$ is in the domain of $X$ and in the domain of $P$.

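Having obtained (12.19), we can easily verify that for any real constants
$\alpha$ and $\beta$, we have

$$\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle = \langle (P - \beta I)\psi, (X - \alpha I)\psi\rangle + i\hbar\langle\psi, \psi\rangle. \tag{12.20}$$

Solving (12.20) for $\langle\psi, \psi\rangle$ gives

$$\begin{aligned}
\langle\psi, \psi\rangle
&= \frac{1}{i\hbar}\Big(\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle - \langle (P - \beta I)\psi, (X - \alpha I)\psi\rangle\Big) \\
&= \frac{2}{\hbar}\operatorname{Im}\langle (X - \alpha I)\psi, (P - \beta I)\psi\rangle \\
&\le \frac{2}{\hbar}\,\|(X - \alpha I)\psi\|\,\|(P - \beta I)\psi\|,
\end{aligned} \tag{12.21}$$

by the Cauchy-Schwarz inequality. If $\psi$ is a unit vector and we take
$\alpha = \langle X\rangle_\psi$ and $\beta = \langle P\rangle_\psi$, then $\|(X - \alpha I)\psi\|^2 = (\Delta_\psi X)^2$ and
$\|(P - \beta I)\psi\|^2 = (\Delta_\psi P)^2$. Thus, we get

$$1 \le \frac{2}{\hbar}(\Delta_\psi X)(\Delta_\psi P),$$

which is equivalent to what we want to prove. ■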

We know from Sect. 12.2 that the strong form of the uncertainty principle
does not hold if $X$ and $P$ are replaced by two arbitrary operators satisfying
$AB - BA = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, even if $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is
dense  in  H.  Nevertheless,  if  we  look  carefully  at  the  proof  of  Theorem  12.7, 
we  can  see  what  assumptions  we  would  need  on  A  and  B  to  make  the  proof 
go  through  in  a  more  general  setting. 

Theorem 12.8 Suppose $A$ and $B$ are self-adjoint operators on $\mathbf{H}$. Suppose
that for all $a \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, we have that $e^{iaB}\psi$ belongs to $\mathrm{Dom}(A)$
and that

$$Ae^{iaB}\psi = e^{iaB}A\psi - \hbar a\,e^{iaB}\psi. \tag{12.22}$$

Then for all unit vectors $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$, we have

$$(\Delta_\psi A)(\Delta_\psi B) \ge \frac{\hbar}{2},$$

where $\Delta_\psi A$ and $\Delta_\psi B$ are defined by Definition 12.2.

The relation

$$e^{iaB}A = Ae^{iaB} + \hbar a\,e^{iaB}, \qquad a \in \mathbb{R}, \tag{12.23}$$

which holds on $\mathrm{Dom}(A)$, is a "semi-exponentiated" form of the canonical
commutation relations. As shown in Exercise 6, there is a formal argument
(ignoring domain issues) that the commutation relations $[A, B] = i\hbar I$ ought
to imply the relations (12.22). Nevertheless, as Exercise 7 shows, this formal
argument does not always give the correct conclusion. In Sect. 14.2, we
will encounter a "fully exponentiated" form of the canonical commutation
relations, in which both $A$ and $B$ are exponentiated.

Proof.  See  Exercise  5.  ■ 

Corollary 12.9 For any $j = 1, \ldots, n$ and any unit vector $\psi \in L^2(\mathbb{R}^n)$ with
$\psi \in \mathrm{Dom}(X_j) \cap \mathrm{Dom}(P_j)$, we have

$$(\Delta_\psi X_j)(\Delta_\psi P_j) \ge \frac{\hbar}{2}.$$

Proof. In the case that $A = X_j$ and $B = P_j$, we have
$(e^{iaB/\hbar}\psi)(\mathbf{x}) = \psi(\mathbf{x} + a\mathbf{e}_j)$, by Exercise 2 in Chap. 10. Thus, in this case,
(12.22) (in the form (12.23), with $a/\hbar$ in place of $a$) says that

$$(x_j + a)\psi(\mathbf{x} + a\mathbf{e}_j) = x_j\psi(\mathbf{x} + a\mathbf{e}_j) + a\psi(\mathbf{x} + a\mathbf{e}_j),$$

which is true. ■


12.4 Minimum Uncertainty States

In this section, we look at the states that give equality in the uncertainty
principle. Such states are known as minimum uncertainty states or coherent
states. As in the general setting of Proposition 12.6, the condition for
equality is an eigenvector condition. That is to say, even though in
Theorem 12.7, we allow $\psi$'s that are not in $\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we do not
get any new minimum uncertainty states by this weakening of our domain
assumptions.

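Proposition 12.10 A unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ satisfies

$$(\Delta_\psi X)(\Delta_\psi P) = \frac{\hbar}{2}$$

if and only if $\psi$ satisfies

$$(X + i\delta P)\psi = \lambda\psi \tag{12.24}$$

for some nonzero real number $\delta$ and some complex number $\lambda$.

For convenience, we have made the substitution $\delta = -\gamma$ in (12.24) relative
to Proposition 12.6.

FIGURE 12.1. Minimum uncertainty state with $\langle X\rangle = 1$, $\langle P\rangle = 0$, and
$\Delta X = 1/2$. Vertical axis: $\operatorname{Re}[\psi(x)]$.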


Proof. All the relations in the proof of Theorem 12.7 are equalities, except
for the inequality in the last line of (12.21). Equality will hold in that line
if and only if one of $(X - \alpha I)\psi$ and $(P - \beta I)\psi$ is zero or $(P - \beta I)\psi$ is a
pure-imaginary multiple of $(X - \alpha I)\psi$. Now, if $\psi$ is a unit vector in $L^2(\mathbb{R})$,
then neither $\psi$ nor the Fourier transform of $\psi$ can be supported at a single
point; thus, neither $(X - \alpha I)\psi$ nor $(P - \beta I)\psi$ can be zero. We are left,
then, with the condition that

$$(X - \alpha I)\psi = i\gamma(P - \beta I)\psi, \tag{12.25}$$

where $\gamma$ is a nonzero real number, $\alpha = \langle X\rangle_\psi$ and $\beta = \langle P\rangle_\psi$. As in the
proof of Proposition 12.6, (12.25) is equivalent to the assertion that $\psi$ is
an eigenvector for the operator $X - i\gamma P$. Letting $\delta = -\gamma$ gives the desired
result. ■

FIGURE 12.2. Minimum uncertainty state with $\langle X\rangle = 1$, $\langle P\rangle = 10$, and
$\Delta X = 1/2$. Vertical axis: $\operatorname{Re}[\psi(x)]$.


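Proposition 12.11 If the parameter $\delta$ in (12.24) is negative, there are
no nonzero solutions to (12.24). If the parameter $\delta$ is positive, there exists
a unique (up to multiplication by a constant) solution $\psi_{\delta,\lambda}$ to (12.24) for
every complex number $\lambda$. The function $\psi_{\delta,\lambda}$ has the following additional
properties:

$$\langle X\rangle = \operatorname{Re}\lambda, \qquad \langle P\rangle = \frac{1}{\delta}\operatorname{Im}\lambda, \qquad \frac{\Delta X}{\Delta P} = \delta.$$

Explicitly, we have

$$\psi_{\delta,\lambda}(x) = c_1\exp\left\{-\frac{(x - \lambda)^2}{2\delta\hbar}\right\}
= c_2\exp\left\{-\frac{(x - \langle X\rangle)^2}{2\delta\hbar}\right\}\exp\left\{\frac{i\langle P\rangle x}{\hbar}\right\},$$

where all expectation values are taken in the state $\psi_{\delta,\lambda}$.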

Note that among states with $(\Delta X)(\Delta P) = \hbar/2$, we can arrange for
$\Delta X/\Delta P$ to be any positive real number, and once we have chosen $\Delta X/\Delta P$,
we can then arrange for $\langle X\rangle$ and $\langle P\rangle$ to be any two real numbers. On the
other hand, once $\Delta X/\Delta P$ and $\langle X\rangle$ and $\langle P\rangle$ have been specified, there is a
unique quantum state with $(\Delta X)(\Delta P) = \hbar/2$. In Figs. 12.1-12.3, we have
plotted the real part of $\psi_{\delta,\lambda}$ for several different values of the parameters,
in a system of units for which $\hbar = 1$.

FIGURE 12.3. Minimum uncertainty state with $\langle X\rangle = 1$, $\langle P\rangle = 20$, and $\Delta X = 1$.

Proof. The equation $(X + i\delta P)\psi = \lambda\psi$ amounts to

$$x\psi(x) + \delta\hbar\frac{d\psi}{dx} = \lambda\psi(x), \tag{12.26}$$

where $\psi$ is assumed to be in the domain of $P$, so that the distributional
derivative of $\psi$ is an $L^2$ function. If $\psi$ were smooth, then the unique solution
to (12.26) would be the function $\psi_{\delta,\lambda}$ given in the proposition, which
is square-integrable if and only if $\delta > 0$. Even if (12.26) is only assumed
to hold in the distribution sense, the argument in the proof of
Proposition 9.29 (with $e^{x/\hbar}\psi(x)$ replaced by $\exp[(x - \lambda)^2/(2\delta\hbar)]\,\psi(x)$) shows that
there are no additional solutions. The formulas for $\langle X\rangle$, $\langle P\rangle$, and $\Delta X/\Delta P$
can be computed either by tracing through the arguments in the proof of
Theorem 12.7 or by direct calculation with the formula for $\psi_{\delta,\lambda}$. ■
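
The "direct calculation" mentioned at the end of the proof can be imitated numerically. The Python sketch below is an illustration only: the function name `coherent_state_stats`, the midpoint-rule quadrature, the integration window of twelve widths, and the default $\hbar = 1$ are all assumptions made here, not part of the text. It evaluates $\langle X\rangle$, $\langle P\rangle$, $\Delta X$, and $\Delta P$ for the unnormalized Gaussian $\exp[-(x-\lambda)^2/(2\delta\hbar)]$, using (12.26) to express $P\psi$ without numerical differentiation.

```python
import cmath
import math


def coherent_state_stats(delta, lam, hbar=1.0, steps=20000):
    """Return (<X>, <P>, Delta X, Delta P) for the unnormalized state
    psi(x) = exp(-(x - lam)^2 / (2*delta*hbar)), with delta > 0 and lam complex."""
    centre = complex(lam).real
    half_width = 12.0 * math.sqrt(delta * hbar)  # tails beyond this are negligible
    h = 2.0 * half_width / steps
    norm = mean_x = mean_x2 = mean_p = mean_p2 = 0.0
    for k in range(steps):
        x = centre - half_width + (k + 0.5) * h  # midpoint of the k-th cell
        psi = cmath.exp(-(x - lam) ** 2 / (2.0 * delta * hbar))
        # By (12.26), hbar * psi'(x) = (lam - x) * psi(x) / delta, hence
        # (P psi)(x) = -i * hbar * psi'(x) = i * (x - lam) * psi(x) / delta.
        p_psi = 1j * (x - lam) * psi / delta
        weight = abs(psi) ** 2
        norm += weight
        mean_x += x * weight
        mean_x2 += x * x * weight
        mean_p += (psi.conjugate() * p_psi).real
        mean_p2 += abs(p_psi) ** 2
    # The common cell width h cancels in every ratio below.
    mean_x, mean_x2 = mean_x / norm, mean_x2 / norm
    mean_p, mean_p2 = mean_p / norm, mean_p2 / norm
    dx = math.sqrt(mean_x2 - mean_x ** 2)
    dp = math.sqrt(mean_p2 - mean_p ** 2)
    return mean_x, mean_p, dx, dp
```

For instance, with $\hbar = 1$ the call `coherent_state_stats(0.5, 1 + 5j)` should reproduce, up to quadrature error, the parameters quoted in the caption of Figure 12.2: $\langle X\rangle = \operatorname{Re}\lambda = 1$, $\langle P\rangle = \operatorname{Im}\lambda/\delta = 10$, and $\Delta X = 1/2$, with $\Delta X\,\Delta P = 1/2$ and $\Delta X/\Delta P = \delta$.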


12.5  Exercises 

1. Let $a$ be a positive real number. Show that the following "additive"
version of the uncertainty principle holds for all unit vectors
$\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$:

$$a\,\Delta_\psi X + \frac{1}{a}\,\Delta_\psi P \ge \sqrt{2\hbar}.$$

2. In this exercise, we classify the simultaneous eigenvectors of the
noncommuting operators $\hat{J}_1$ and $\hat{J}_2$. Let $\hat{J}_1$, $\hat{J}_2$, and $\hat{J}_3$ denote the angular
momentum operators on $L^2(\mathbb{R}^3)$ as defined in Sect. 3.10. Suppose $\psi$
is in the domain of any product $\hat{J}_j\hat{J}_k$ of two angular momentum
operators. (For example, $\psi$ could be a Schwartz function.) Suppose also
that $\psi$ is an eigenvector for $\hat{J}_1$ and for $\hat{J}_2$ with eigenvalues $\alpha$ and $\beta$,
respectively.

(a) Using the commutation relations in Exercise 10 in Chap. 3, show
that $\psi$ is an eigenvector for $\hat{J}_3$ with eigenvalue 0.

(b) Show that the eigenvalues $\alpha$ and $\beta$ for $\hat{J}_1$ and $\hat{J}_2$ must be zero.

(c) What type of function $\psi \in L^2(\mathbb{R}^3)$ satisfies $\hat{J}_j\psi = 0$ for $j = 1, 2, 3$?

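3. Given any unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$, consider another
vector $\phi$ given by

$$\phi(x) = e^{ibx/\hbar}\psi(x - a).$$

Show that $\phi$ is a unit vector belonging to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ and
that

$$\langle X\rangle_\phi = \langle X\rangle_\psi + a, \qquad \Delta_\phi X = \Delta_\psi X$$

and

$$\langle P\rangle_\phi = \langle P\rangle_\psi + b, \qquad \Delta_\phi P = \Delta_\psi P.$$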


4. We have seen that a unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ is a minimum
uncertainty state [i.e., $(\Delta_\psi X)(\Delta_\psi P) = \hbar/2$] if and only if there exists
some $\delta > 0$ such that $\psi$ is an eigenvector of the operator $X + i\delta P$.
In that case, $\psi$ is also an eigenvector for any operator of the form
$c(X + i\delta P)$, with $c$ being a nonzero constant. Consider, then, some
fixed $\delta > 0$ and define an operator $a$ by the formula

$$a = \frac{X + i\delta P}{\sqrt{2\hbar\delta}}.$$

Then $a$ is just the annihilation operator, as defined in Chap. 11, for a
harmonic oscillator with $m\omega = 1/\delta$. Thus, $a$ and its adjoint $a^*$ satisfy
the relation $[a, a^*] = I$, and we have the "chain" of eigenvectors
$\psi_n \in L^2(\mathbb{R})$ satisfying the properties listed in Theorem 11.2.

(a) For any $\lambda \in \mathbb{C}$, find constants $c_n$ so that the vector

$$\phi_\lambda := \sum_{n=0}^{\infty} c_n\psi_n$$

is an eigenvector for $a$ with eigenvalue $\lambda$. Show that the resulting
series converges in $\mathbf{H}$.

(b) Let $\phi_\lambda$ denote the eigenvector obtained in Part (a), normalized
so that $c_0 = 1$. Show that

$$\phi_\lambda = e^{\lambda a^*}\psi_0,$$

where the exponential is defined by

$$e^{\lambda a^*}\psi_0 = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}(a^*)^n\psi_0,$$

with convergence in $L^2(\mathbb{R})$.

5. Prove Theorem 12.8, following the outline of the proof of Theorem 12.7.
Recall from Sect. 10.2 that $B/\hbar$ is the infinitesimal generator
of the one-parameter unitary group $U(a) := e^{iaB/\hbar}$.

6. If $X$ and $Y$ are bounded operators, we may define $\mathrm{ad}_X(Y) = [X, Y]$,
where $[X, Y] = XY - YX$. Thus, say, $(\mathrm{ad}_X)^3(Y) = [X, [X, [X, Y]]]$.
It is not hard to show that for any bounded operators $Y$ and $X$, we
have

$$e^{X}Ye^{-X} = e^{\mathrm{ad}_X}(Y) = Y + [X, Y] + \frac{[X, [X, Y]]}{2!} + \frac{[X, [X, [X, Y]]]}{3!} + \cdots. \tag{12.27}$$

(See Proposition 2.25 and Exercise 2.19 of [21].)

Suppose $A$ and $B$ are unbounded self-adjoint operators satisfying
$[A, B] = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Show that if we could apply
(12.27) with $X = iaB/\hbar$ and $Y = A$ (even though $X$ and $Y$ are
unbounded), then $A$ and $B$ would satisfy (12.22).

7. Let $A$ be the operator in Sect. 12.2, and let $B$ be the unique
self-adjoint extension of the operator $B$ in that section. Show that the
operators $X = iaB/\hbar$ and $Y = A$ do not satisfy (12.27).

Note: This result shows the hazards involved in formally applying results
for bounded operators to unbounded operators.

Hint: Show that the unitary operators $U(a) := \exp(iaB/\hbar)$ consist
of "translation with wrap around," first on the eigenvectors of $B$ and
then on the whole Hilbert space.


13 Quantization Schemes for Euclidean Space


13.1  Ordering  Ambiguities 

One of the axioms of quantum mechanics states, "To each real-valued
function $f$ on the classical phase space there is associated a self-adjoint
operator $\hat{f}$ on the quantum Hilbert space." The attentive reader will note
that we have not, up to this point, given a general procedure for
constructing $\hat{f}$ from $f$. If we call $\hat{f}$ the quantization of $f$, then we have only
discussed the quantizations of a few very special classical observables, such
as position, momentum, and energy.

Let us now think about what would go into quantizing a (more-or-less)
general observable. Let us consider for simplicity a particle moving in $\mathbb{R}^1$
and let us assume that quantizations of $x$ and $p$ are the usual position
and momentum operators $X$ and $P$. What should the quantization of, say,
$xp$ be? Classically, $xp$ and $px$ are the same, but quantum mechanically,
$XP$ does not equal $PX$. Furthermore, neither $XP$ nor $PX$ is self-adjoint,
because $(XP)^* = P^*X^* = PX$, and $PX \ne XP$. In this case, then, a
reasonable candidate for the quantization would be

$$\widehat{xp} = \frac{1}{2}(XP + PX).$$

The significance of this simple example is that the failure of commutativity
among quantum operators creates an ambiguity in the quantization
process. It does not make sense to simply "replace $x$ by $X$ and $p$ by $P$
everywhere in the formula," since the ordering of position and momentum
makes no difference on the classical side, but it does on the quantum
side. Up to this point, we have not really had to confront this ambiguity,
because of the special form of the observables we have quantized. The
Hamiltonian, for example, is typically of the form $H(x,p) = p^2/(2m) + V(x)$.
Since each term contains only $x$ or only $p$, it is natural to quantize
$H$ to $\hat{H} = P^2/(2m) + V(X)$, where $V(X)$ may be defined by the functional
calculus or simply as multiplication by $V(x)$. In defining the angular
momentum operators, we do encounter products of position and momentum,
but never of the same component of position and momentum. For a particle
in $\mathbb{R}^2$, for example, we have $J = x_1p_2 - x_2p_1$. On the quantum side,
$X_1$ commutes with $P_2$ and $X_2$ with $P_1$, and thus there is no ambiguity:
$X_1P_2 - X_2P_1$ is the same as $P_2X_1 - P_1X_2$.

When  we  turn  to  the  quantization  of  a  general  observable,  however, 
we  must  confront  the  ordering  ambiguity  directly.  Groenewold’s  theorem 
(Sect.  13.4)  suggests  that  there  is  no  single  “perfect”  quantization  scheme. 
Nevertheless,  there  is  one  that  is  generally  acknowledged  as  having  the  best 
properties,  the  Weyl  quantization,  and  we  spend  most  of  our  time  with 
that  particular  scheme.  Other  quantization  schemes  do  also  play  a  role  in 
physics,  however;  Wick-ordered  quantization,  notably,  plays  an  important 
role  in  quantum  field  theory.  (In  quantum  field  theory,  the  replacement  of 
certain  Weyl-quantized  operators  with  their  Wick-quantized  counterparts 
is  interpreted  as  a  type  of  renormalization.) 


13.2  Some  Common  Quantization  Schemes 

In this section, we consider several of the most commonly used quantization
schemes. For simplicity, we limit our attention to systems with one degree
of freedom and to classical observables that are polynomials in $x$ and $p$.
(We consider the Weyl quantization in greater generality in Sect. 13.3.)
Furthermore, we resolve in this section not to worry about domain questions
and simply to use $C_c^\infty(\mathbb{R})$ as the domain for all of our operators. Thus,
in this section, equality of operators means equality as maps of $C_c^\infty(\mathbb{R})$ to
itself. It should be noted that the operators of the sort we will be considering
may very well fail to be essentially self-adjoint, even if they are symmetric.
Section 9.10 shows, for example, that the operator $P^2 - cX^4$, for $c > 0$,
is not essentially self-adjoint on $C_c^\infty(\mathbb{R})$. We follow the terminology of
harmonic analysis by referring to a classical observable $f$ as the symbol of its
quantization $\hat{f}$. Once we have discussed each quantization scheme briefly,
we will formalize the definitions of all the schemes in Definition 13.1.

The simplest approach to quantization is to choose, once and for all,
which to put first, the position or the momentum operators. We may, for
example, choose to put the momentum operators to the right, acting first,
and the position operators to the left, acting second. In this approach, a
polynomial in $x$ and $p$ will quantize to a differential operator in "standard
form," with all the derivatives acting first, followed by multiplication
operators. In harmonic analysis, there is a method for extending this
quantization scheme to more-or-less arbitrary symbols $f$. For a general
(nonpolynomial) symbol $f$, the resulting operator $\hat{f}$ is known as a
pseudodifferential operator.

A serious drawback of the pseudodifferential quantization is that even
when the symbol $f$ is real-valued, the operator $\hat{f}$ it produces is typically
not self-adjoint (or even symmetric). If, for example, $f(x,p) = xp$, then the
associated operator is $XP$, the adjoint of which is $PX$, which is not equal
to $XP$. The simplest way to fix this problem is to symmetrize the operator
by taking half the sum of the operator and its adjoint.

The Weyl quantization, meanwhile, takes more seriously the possibility
of different orderings of $X$ and $P$, by considering all possible orderings.
Thus, in quantizing, say, $x^2p^2$, the Weyl quantization will give

$$\frac{1}{6}\left(X^2P^2 + XPXP + XP^2X + PX^2P + PXPX + P^2X^2\right).$$

For a general monomial, the Weyl quantization similarly averages all the
possible orderings of the position and momentum operators.

For Wick-ordered and anti-Wick-ordered quantization, we no longer
regard the position and momentum operators as the "basic" operators,
but rather the creation and annihilation operators. Specifically, given any
positive real number $\alpha$, we introduce complex coordinates on the classical
phase space by

$$z = x - i\alpha p, \qquad \bar{z} = x + i\alpha p. \tag{13.1}$$

(Although it would seem more natural to define $z$ to be $x + i\alpha p$, this
choice would lead to problems later, especially with the Segal-Bargmann
transform.) We then consider the corresponding quantum operators, which
we call the raising and lowering operators:

$$a^* = X - i\alpha P, \qquad a = X + i\alpha P. \tag{13.2}$$

In comparing these operators to the ones defined in the context of the
harmonic oscillator, we should think of $\alpha$ as corresponding to $1/(m\omega)$.
Even with this identification, however, the operators in (13.2) differ by a
constant from the raising and lowering operators of Chap. 11. [The overall
normalization of the raising and lowering operators is not important
in this context, provided that we are consistent in the normalization
between (13.1) and (13.2).] In particular, the commutator of $a$ and $a^*$ is not
$I$ but rather $2\alpha\hbar I$.

In Wick-ordered quantization, we begin by expressing the classical
observable $f$ in terms of $z$ and $\bar{z}$ rather than in terms of $x$ and $p$. When we
quantize, we put all the lowering operators (coming from the factors of $\bar{z}$
in $f$) to the right, acting first, and the raising operators (coming from the
factors of $z$ in $f$) to the left, acting second. This approach to quantization is
useful in quantum field theory, where letting the lowering operators act first
can cause certain otherwise ill-defined expressions to become well defined.
In anti-Wick-ordered quantization, we do the reverse, putting the raising
operators to the right, acting first. Although anti-Wick-ordered quantization
seems singular in the context of quantum field theory, in systems with
finitely many degrees of freedom, it is actually better behaved than
Wick-ordered quantization.

Definition 13.1 Define several different quantization schemes for symbols
that are polynomials in $x$ and $p$ as follows. Each scheme is uniquely
determined, as a map from polynomials on $\mathbb{R}^2$ into operators on $C_c^\infty(\mathbb{R})$,
by the indicated formulas.

1. Pseudodifferential operator quantization:

$$Q(x^jp^k) = X^jP^k.$$

2. Symmetrized pseudodifferential operator quantization:

$$Q(x^jp^k) = \frac{1}{2}\left(X^jP^k + P^kX^j\right).$$

3. Weyl quantization:

$$Q(x^jp^k) = \frac{1}{(j+k)!}\sum_{\sigma \in S_{j+k}}\sigma(\underbrace{X, \ldots, X}_{j}, \underbrace{P, \ldots, P}_{k}),$$

where for any operators $A_1, A_2, \ldots, A_n$ and any $\sigma \in S_n$, we define

$$\sigma(A_1, A_2, \ldots, A_n) = A_{\sigma(1)}A_{\sigma(2)}\cdots A_{\sigma(n)}. \tag{13.3}$$

4. Wick-ordered quantization with parameter $\alpha$:

$$Q\left((x + i\alpha p)^j(x - i\alpha p)^k\right) = (X - i\alpha P)^k(X + i\alpha P)^j, \qquad \alpha > 0.$$

5. Anti-Wick-ordered quantization with parameter $\alpha$:

$$Q\left((x + i\alpha p)^j(x - i\alpha p)^k\right) = (X + i\alpha P)^j(X - i\alpha P)^k, \qquad \alpha > 0.$$


In applications, the most useful quantization schemes are the
Wick-ordered, anti-Wick-ordered, and Weyl schemes. All of the quantization
schemes in Definition 13.1 except the pseudodifferential operator quantization
have the property of mapping real-valued polynomials to symmetric
operators on $C_c^\infty(\mathbb{R})$. (See Exercise 3 in the case of the Wick- and
anti-Wick-ordered quantizations.)

In comparing the different quantization schemes, it is important to
recognize that two different expressions may describe the same operator. We
may calculate, for example, that

$$\frac{1}{2}\left(XP^2 + P^2X\right) = \frac{1}{2}\left(PXP + [X, P]P + PXP - P[X, P]\right) = PXP,$$

since $[X, P]$ is a multiple of the identity and thus commutes with $P$. As a
result, we can eliminate the $PXP$ term in the Weyl quantization of $xp^2$,
with the result that

$$Q_{\mathrm{Weyl}}(xp^2) = \frac{1}{3}\left(XP^2 + PXP + P^2X\right) = \frac{1}{2}\left(XP^2 + P^2X\right), \tag{13.4}$$

which coincides, in this very special case, with the symmetrized
pseudodifferential quantization of $xp^2$.
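
The identity (13.4) can be sanity-checked by letting the operators act on polynomials. The Python sketch below is illustrative only; the helper names (`apply_X`, `apply_D`, `apply_word`, `combine`, `weyl_ordered`, `monomial`) are hypothetical and not part of the text. It assumes the standard realization $P = -i\hbar\,d/dx$, so that every product containing two factors of $P$ equals $-\hbar^2$ times the same product with $P$ replaced by $D = d/dx$; that common factor cancels on both sides of (13.4), which allows exact rational arithmetic. Agreement on finitely many monomials is evidence for the identity, not a proof.

```python
from fractions import Fraction
from itertools import permutations


def apply_X(poly):
    """Multiply a polynomial (coefficient list, lowest degree first) by x."""
    return [Fraction(0)] + list(poly)


def apply_D(poly):
    """Differentiate a polynomial with respect to x."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]


def apply_word(word, poly):
    """Apply a product of operators such as 'DXD'; the rightmost letter acts first."""
    for letter in reversed(word):
        poly = apply_X(poly) if letter == "X" else apply_D(poly)
    return poly


def combine(terms):
    """Linear combination of (coefficient, polynomial) pairs, trailing zeros trimmed."""
    length = max(len(p) for _, p in terms)
    total = [Fraction(0)] * length
    for c, p in terms:
        for k, coeff in enumerate(p):
            total[k] += c * coeff
    while len(total) > 1 and total[-1] == 0:
        total.pop()
    return total


def weyl_ordered(j, k, poly):
    """Average, over all (j+k)! orderings, of j factors X and k factors D applied to poly."""
    orderings = list(permutations("X" * j + "D" * k))
    weight = Fraction(1, len(orderings))
    return combine([(weight, apply_word(w, poly)) for w in orderings])


def monomial(m):
    """The polynomial x**m as a coefficient list."""
    return [Fraction(0)] * m + [Fraction(1)]
```

For example, `weyl_ordered(1, 2, monomial(2))` averages $XD^2$, $DXD$, and $D^2X$ applied to $x^2$, and it should agree both with `apply_word("DXD", monomial(2))` and with half the sum of `apply_word("XDD", ...)` and `apply_word("DDX", ...)`.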

Example 13.2 If $f(x,p) = x^2$, then the Weyl, Wick-ordered and
anti-Wick-ordered quantizations of $f$ are as follows:

$$Q_{\mathrm{Weyl}}(x^2) = X^2$$

$$Q_{\mathrm{Wick}}(x^2) = X^2 - \frac{1}{2}\alpha\hbar I$$

$$Q_{\mathrm{anti\text{-}Wick}}(x^2) = X^2 + \frac{1}{2}\alpha\hbar I.$$

Proof. The value for $Q_{\mathrm{Weyl}}(x^2)$ is apparent. To compute the Wick- and
anti-Wick-ordered quantizations, we first write $x$ as $(z + \bar{z})/2$, so that

$$x^2 = \frac{(z + \bar{z})^2}{4} = \frac{1}{4}\left(z^2 + 2z\bar{z} + \bar{z}^2\right).$$

Thus, we have, for example,

$$Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left((X - i\alpha P)^2 + 2(X - i\alpha P)(X + i\alpha P) + (X + i\alpha P)^2\right).$$

When we expand this expression out, the $P^2$ terms cancel, and the $XP$
and $PX$ terms from $(X - i\alpha P)^2$ will cancel with the $XP$ and $PX$ terms
from $(X + i\alpha P)^2$. Thus, we will be left with $X^2$ terms and the $XP$ and
$PX$ terms from the cross-term above:

$$Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left(4X^2 + 2i\alpha[X, P]\right).$$

Using the commutation relation between $X$ and $P$ gives the desired result.
The calculation of $Q_{\mathrm{anti\text{-}Wick}}(x^2)$ is identical except that the order of the
factors in the cross-term is reversed, which gives the opposite sign for the
$[X, P]$ term. ■
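
The computation in Example 13.2 can be spot-checked on polynomials. The following Python sketch is an illustration under assumptions not stated in the text: it uses the realization $P = -i\hbar\,d/dx$, under which $i\alpha P = \alpha\hbar\,D$ with $D = d/dx$, so that the lowering and raising operators become the real operators $X + cD$ and $X - cD$ with $c = \alpha\hbar$. The names `add`, `X`, `D`, and `ordered_x_squared` are hypothetical helpers, and agreement on a few monomials is a consistency check rather than a proof.

```python
from fractions import Fraction


def add(p, q, cp=1, cq=1):
    """Return cp*p + cq*q for coefficient lists (lowest degree first), trimmed."""
    out = [Fraction(0)] * max(len(p), len(q))
    for k, c in enumerate(p):
        out[k] += cp * c
    for k, c in enumerate(q):
        out[k] += cq * c
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def X(poly):
    """Multiplication by x."""
    return [Fraction(0)] + list(poly)


def D(poly):
    """Differentiation d/dx."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]


def ordered_x_squared(poly, c, anti=False):
    """Apply (1/4)(R^2 + 2*cross + L^2) to poly, where L = X + c*D is the
    lowering operator, R = X - c*D is the raising operator, c = alpha*hbar,
    and cross = R L (Wick order) or L R (anti-Wick order)."""
    def lower(p):
        return add(X(p), D(p), 1, c)

    def raise_(p):
        return add(X(p), D(p), 1, -c)

    rr = raise_(raise_(poly))
    ll = lower(lower(poly))
    cross = lower(raise_(poly)) if anti else raise_(lower(poly))
    return add(add(rr, ll), cross, Fraction(1, 4), Fraction(1, 2))
```

With `c = Fraction(3, 7)`, say, `ordered_x_squared(f, c)` should equal $x^2 f - (c/2) f$ and `ordered_x_squared(f, c, anti=True)` should equal $x^2 f + (c/2) f$ for each test polynomial `f`, matching $X^2 \mp \tfrac{1}{2}\alpha\hbar I$.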

Proposition 13.3 The Weyl quantization, viewed as a linear map of the
space of polynomials on $\mathbb{R}^2$ into operators on $C_c^\infty(\mathbb{R})$, is uniquely
characterized by the following identity:

$$Q_{\mathrm{Weyl}}\left((ax + bp)^j\right) = (aX + bP)^j \tag{13.5}$$

for all non-negative integers $j$ and all $a, b \in \mathbb{C}$.

Proof. The Weyl quantization is easily seen to satisfy the identity

$$Q_{\mathrm{Weyl}}\left((a_1x + b_1p)\cdots(a_jx + b_jp)\right)
= \frac{1}{j!}\sum_{\sigma \in S_j}\sigma(a_1X + b_1P, \ldots, a_jX + b_jP), \tag{13.6}$$

for all sequences $a_1, \ldots, a_j$ and $b_1, \ldots, b_j$ of complex numbers, where the
expression $\sigma(\cdot, \cdot, \ldots, \cdot)$ is defined by (13.3). Specializing to the case where all
the $a_k$'s are equal to $a$ and all the $b_k$'s are equal to $b$ gives (13.5). Conversely,
suppose that $Q$ is any linear map of polynomials into operators on $C_c^\infty(\mathbb{R})$
satisfying $Q((ax + bp)^j) = (aX + bP)^j$ for all $a$, $b$, and $j$. For each $j$, let
$V_j$ denote the space of homogeneous polynomials $f$ of degree $j$ such that
$Q(f) = Q_{\mathrm{Weyl}}(f)$. Then $V_j$ contains all polynomials of the form $(ax + bp)^j$
and thus, by Exercise 1, $V_j$ consists of all homogeneous polynomials of
degree $j$, so that $Q = Q_{\mathrm{Weyl}}$. ■


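Proposition 13.4 The Weyl quantization satisfies

$$Q_{\mathrm{Weyl}}(xg) = Q_{\mathrm{Weyl}}(x)Q_{\mathrm{Weyl}}(g) - \frac{i\hbar}{2}Q_{\mathrm{Weyl}}\left(\frac{\partial g}{\partial p}\right) \tag{13.7}$$

$$\phantom{Q_{\mathrm{Weyl}}(xg)} = Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(x) + \frac{i\hbar}{2}Q_{\mathrm{Weyl}}\left(\frac{\partial g}{\partial p}\right) \tag{13.8}$$

and

$$Q_{\mathrm{Weyl}}(pg) = Q_{\mathrm{Weyl}}(p)Q_{\mathrm{Weyl}}(g) + \frac{i\hbar}{2}Q_{\mathrm{Weyl}}\left(\frac{\partial g}{\partial x}\right) \tag{13.9}$$

$$\phantom{Q_{\mathrm{Weyl}}(pg)} = Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(p) - \frac{i\hbar}{2}Q_{\mathrm{Weyl}}\left(\frac{\partial g}{\partial x}\right) \tag{13.10}$$

for all polynomials $g$ in $x$ and $p$.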


It  should  be  noted  that  the  formulas  for  the  Weyl  quantization  in  Propo¬ 
sition  13.4  may  not  give  the  same  “expression”  for  Qweyiif)  as  does 
Definition  13.1,  but  it  does  give  the  same  operator.  [Compare  (13.4).] 
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Proof.  Suppose  A  =  (a\X  +  b\P)  and  B  =  (c 12X  A  62 -P)-  Then  [ A ,  B]  is  a 
multiple  of  /,  from  which  we  can  easily  verify  that 

ABj  =  BkABj~k  +  k[A,B]Bj~1, 

for  0  <  k  <  j.  If  we  sum  this  relation  over  k  and  divide  by  j  A  1,  we  obtain 
ABj  =  BkABj~k  +  - —^^  +  1\a,  B]Bj-\  (13.11) 

^  1  k=0  ^ 

Now,  A  is  the  Weyl  quantization  of  (a\X  Ab\p)  and  B 3  is  the  Weyl  quanti¬ 
zation  of  (a2X  +  b2py ,  and  both  terms  on  the  right-hand  side  of  (13.11)  are 
easily  recognized  as  Weyl  quantizations.  Thus,  after  rearranging  the  terms 
and  evaluating  the  commutator,  (13.11)  becomes, 

Qwey\((aix  +  b1p)(a2x  A  b2p)J) 

=  Qwey\(dlX  +  bip)QyKey\((a2X  +  b2p)J) 

-  ih^-{aib2  -  a2^i)Qweyi((^ix  +  bip)^1).  (13.12) 

Meanwhile,  if  we  run  the  same  argument  starting  with  BJ  A  we  obtain  a 
similar  result: 


Qweyl((«lX  +  b1p)(a2x  A  b2p)3) 

=  Qweyl((<T>T  +  b2p)J)Qwey\(aiX  A  hp) 

+  ih^(aib2  -  CL2bi)Qwey\((aix  A  bip)J_1).  (13.13) 

If we specialize to the case $(a_1, b_1) = (1, 0)$ and $(a_2, b_2) = (a, b)$, we get

\begin{align}
Q_{\mathrm{Weyl}}\big(x(ax + bp)^j\big) &= Q_{\mathrm{Weyl}}(x)\,Q_{\mathrm{Weyl}}\big((ax + bp)^j\big) \notag\\
&\quad - i\hbar\,\frac{j}{2}\,b\,Q_{\mathrm{Weyl}}\big((ax + bp)^{j-1}\big), \tag{13.14}
\end{align}

where the last term on the right-hand side of (13.14) is $-i\hbar/2$ times the Weyl quantization of $\partial(ax + bp)^j/\partial p$. Thus, (13.14) is precisely (13.7) in the case $g(x,p) = (ax + bp)^j$. We can then see from Exercise 1 that (13.7) holds for all polynomials $g$. The proofs of (13.8), (13.9), and (13.10) are similar.
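The identities (13.7) and (13.9) can be spot-checked numerically. The following is a minimal sketch, assuming that Definition 13.1 is the usual totally symmetrized ordering of the factors $X$ and $P$, that $n = 1$, and that $\hbar = 1$; the helper names (`weyl`, `residual_13_7`, and so on) are hypothetical and chosen only for this illustration. Operators act on polynomials stored as coefficient lists.

```python
from itertools import permutations

HBAR = 1.0  # assumption: units with hbar = 1


def X(poly):
    """Position operator: multiply a polynomial (coefficients, lowest degree first) by x."""
    return [0j] + list(poly)


def P(poly):
    """Momentum operator P = -i*hbar*d/dx acting on a polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]


def weyl(a, b, poly):
    """Apply Q_Weyl(x^a p^b) to poly by averaging over all orderings of a X's and b P's."""
    words = list(permutations([X] * a + [P] * b))
    total = [0j] * (len(poly) + a)
    for word in words:
        out = list(poly)
        for op in reversed(word):  # the rightmost factor acts first
            out = op(out)
        for k, c in enumerate(out):
            total[k] += c
    return [c / len(words) for c in total]


def combine(u, v, scale):
    """Return u + scale*v, padding the shorter coefficient list with zeros."""
    n = max(len(u), len(v))
    u = list(u) + [0j] * (n - len(u))
    v = list(v) + [0j] * (n - len(v))
    return [s + scale * t for s, t in zip(u, v)]


def residual_13_7(a, b, poly):
    """Size of Q(xg)psi - [Q(x)Q(g)psi - (i hbar/2) Q(dg/dp)psi] for g = x^a p^b."""
    lhs = weyl(a + 1, b, poly)
    rhs = X(weyl(a, b, poly))
    if b > 0:  # dg/dp = b x^a p^(b-1)
        rhs = combine(rhs, weyl(a, b - 1, poly), -0.5j * HBAR * b)
    return max(abs(c) for c in combine(lhs, rhs, -1))


def residual_13_9(a, b, poly):
    """Size of Q(pg)psi - [Q(p)Q(g)psi + (i hbar/2) Q(dg/dx)psi] for g = x^a p^b."""
    lhs = weyl(a, b + 1, poly)
    rhs = P(weyl(a, b, poly))
    if a > 0:  # dg/dx = a x^(a-1) p^b
        rhs = combine(rhs, weyl(a - 1, b, poly), 0.5j * HBAR * a)
    return max(abs(c) for c in combine(lhs, rhs, -1))
```

Both residuals should vanish up to rounding for every monomial $g$ and every test polynomial; by linearity this is consistent with the proposition for all polynomial $g$, although a finite check is of course not a proof.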


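13.3 The Weyl Quantization for $\mathbb{R}^{2n}$

In this section, we study the Weyl quantization on a much larger class of symbols (i.e., classical observables) than the polynomial symbols considered in the previous section. We also generalize from symbols defined on $\mathbb{R}^2$ to symbols defined on $\mathbb{R}^{2n}$.

13.3.1 Heuristics

It is a straightforward matter to extend the Weyl quantization on polynomials from $\mathbb{R}^2$ to $\mathbb{R}^{2n}$. This extended quantization will satisfy

\[ Q_{\mathrm{Weyl}}\big((\mathbf{a}\cdot\mathbf{x} + \mathbf{b}\cdot\mathbf{p})^j\big) = (\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})^j \tag{13.15} \]

for all $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ and all non-negative integers $j$, as in Proposition 13.3 in the $n = 1$ case. Suppose we wish to extend $Q_{\mathrm{Weyl}}$ to certain nonpolynomial symbols, starting with complex exponentials. If we multiply (13.15) by $i^j/j!$ and sum on $j$, we would expect to have

\[ Q_{\mathrm{Weyl}}\big(e^{i(\mathbf{a}\cdot\mathbf{x} + \mathbf{b}\cdot\mathbf{p})}\big) = e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}. \tag{13.16} \]

Now, if $f$ is any sufficiently nice function on $\mathbb{R}^{2n}$, we can expand $f$ as an integral involving functions of the form $\exp(i(\mathbf{a}\cdot\mathbf{x} + \mathbf{b}\cdot\mathbf{p}))$, by using the Fourier transform:

\[ f(\mathbf{x},\mathbf{p}) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \hat f(\mathbf{a},\mathbf{b})\,e^{i(\mathbf{a}\cdot\mathbf{x} + \mathbf{b}\cdot\mathbf{p})}\,d\mathbf{a}\,d\mathbf{b}, \]

where $\hat f$ is the Fourier transform of $f$. In light of (13.16), it is then natural to define

\[ Q_{\mathrm{Weyl}}(f) = (2\pi)^{-n}\int_{\mathbb{R}^{2n}} \hat f(\mathbf{a},\mathbf{b})\,e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}\,d\mathbf{a}\,d\mathbf{b}. \tag{13.17} \]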


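Before proceeding, let us pause for a moment to compute the operator $\exp(i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P}))$. If $A$ and $B$ are bounded operators that commute with their commutator (i.e., such that $[A,[A,B]] = [B,[A,B]] = 0$), then

\[ e^{A+B} = e^{-[A,B]/2}\,e^{A}e^{B}. \tag{13.18} \]

(See Theorem 14.1, which is proved in Sect. 3.1 of [21]. Equation (13.18) is a special case of the Baker-Campbell-Hausdorff formula.) If we formally apply (13.18) with $A = i\mathbf{a}\cdot\mathbf{X}$ and $B = i\mathbf{b}\cdot\mathbf{P}$ (even though these are unbounded operators), so that $[A,B] = -i\hbar(\mathbf{a}\cdot\mathbf{b})I$, we obtain

\[ e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})} = e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{i\mathbf{a}\cdot\mathbf{X}}\,e^{i\mathbf{b}\cdot\mathbf{P}}. \tag{13.19} \]

Meanwhile, by Example 10.16 in Sect. 10.2, we know that

\[ (e^{i\mathbf{b}\cdot\mathbf{P}}\psi)(\mathbf{x}) = \psi(\mathbf{x} + \hbar\mathbf{b}). \]

Thus, we may reasonably hope that

\[ (e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}\psi)(\mathbf{x}) = e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{i\mathbf{a}\cdot\mathbf{x}}\,\psi(\mathbf{x} + \hbar\mathbf{b}). \tag{13.20} \]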
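For bounded operators, the identity (13.18) can be verified exactly in a small example. The sketch below is only an illustration, not part of the argument: it uses strictly upper-triangular $3\times 3$ matrices (a hypothetical stand-in for $A$ and $B$, chosen because their commutator is central and their exponential series terminate) with exact rational arithmetic.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(A, B, scale=1):
    """Return A + scale*B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def exp_nilpotent(N):
    """exp(N) for a strictly upper-triangular n x n matrix: the series stops at N^(n-1)."""
    n = len(N)
    result, term = identity(n), identity(n)
    for k in range(1, n):
        term = [[x / k for x in row] for row in matmul(term, N)]  # term = N^k / k!
        result = add(result, term)
    return result


def upper(a, b, c):
    """Strictly upper-triangular 3x3 matrix with entries a, b (superdiagonal) and c (corner)."""
    a, b, c, z = Fraction(a), Fraction(b), Fraction(c), Fraction(0)
    return [[z, a, c], [z, z, b], [z, z, z]]


A = upper(2, 0, 0)
B = upper(0, 3, 0)
comm = add(matmul(A, B), matmul(B, A), -1)  # [A, B], which commutes with A and B

lhs = exp_nilpotent(add(A, B))
rhs = matmul(exp_nilpotent([[-x / 2 for x in row] for row in comm]),
             matmul(exp_nilpotent(A), exp_nilpotent(B)))
```

Here `lhs` and `rhs` are the two sides of (13.18); they agree entry by entry. No finite-dimensional example can model $X$ and $P$ themselves, which is why the unbounded case needs the separate argument given in the text.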


In general, we get incorrect results if we formally apply results for bounded operators to operators that are unbounded. In this case, however, the result of the formal calculation is correct. The simplest way to prove this is to replace $\mathbf{a}$ and $\mathbf{b}$ by $t\mathbf{a}$ and $t\mathbf{b}$ on the right-hand side of (13.19) and to check that the result is a strongly continuous one-parameter unitary group.

Proposition 13.5 For all $\mathbf{a}$ and $\mathbf{b}$ in $\mathbb{R}^n$, the operators $U_{\mathbf{a},\mathbf{b}}(t)$ on $L^2(\mathbb{R}^n)$ given by

\[ (U_{\mathbf{a},\mathbf{b}}(t)\psi)(\mathbf{x}) = e^{it^2\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{it\mathbf{a}\cdot\mathbf{x}}\,\psi(\mathbf{x} + t\hbar\mathbf{b}) \tag{13.21} \]

form a strongly continuous one-parameter unitary group. The infinitesimal generator of this group coincides with $\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P}$ on $C_c^\infty(\mathbb{R}^n)$ and is essentially self-adjoint on this domain. Thus, if $\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P}$ denotes the unique self-adjoint extension of the infinitesimal generator on $C_c^\infty(\mathbb{R}^n)$, it follows from Stone's theorem that

\[ e^{it(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})} = e^{it^2\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{it\mathbf{a}\cdot\mathbf{X}}\,e^{it\mathbf{b}\cdot\mathbf{P}} \]

for all $t \in \mathbb{R}$. In particular, (13.19) and (13.20) hold.

Proof. It is apparent that $U_{\mathbf{a},\mathbf{b}}(t)$ is unitary for each $\mathbf{a}$ and $\mathbf{b}$, and it is a simple direct computation to show that $U_{\mathbf{a},\mathbf{b}}(\cdot)$ is indeed a unitary group. Strong continuity is proved in the usual way using a dense subspace, as in the proof of Example 10.12. When $\psi$ is in $C_c^\infty(\mathbb{R}^n)$, it is easy to differentiate the right-hand side of (13.21) with respect to $t$ at $t = 0$ to obtain the formula for the infinitesimal generator. Finally, the essential self-adjointness of $\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P}$ on $C_c^\infty(\mathbb{R}^n)$ is precisely the content of Proposition 9.40. ■
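The "simple direct computation" in the proof can be mirrored numerically. The sketch below assumes $n = 1$, an arbitrary test value of $\hbar$, and a Gaussian test function; the names `U`, `psi0`, `group_law_error`, and `generator_error` are hypothetical. It checks the group law $U_{a,b}(s)U_{a,b}(t) = U_{a,b}(s+t)$ pointwise and compares a finite-difference derivative at $t = 0$ with $i(aX + bP)\psi$.

```python
import cmath
import math

HBAR = 0.7  # arbitrary positive test value (an assumption, not from the text)


def U(a, b, t, psi):
    """Return the function U_{a,b}(t) psi defined by (13.21), for n = 1."""
    def shifted(x):
        phase = cmath.exp(1j * (t * t * HBAR * a * b / 2 + t * a * x))
        return phase * psi(x + t * HBAR * b)
    return shifted


def psi0(x):
    """Gaussian test function."""
    return math.exp(-x * x / 2)


def dpsi0(x):
    """Derivative of the Gaussian test function."""
    return -x * math.exp(-x * x / 2)


def group_law_error(a, b, s, t, x):
    """|U(s)U(t)psi - U(s+t)psi| at the point x."""
    lhs = U(a, b, s, U(a, b, t, psi0))(x)
    rhs = U(a, b, s + t, psi0)(x)
    return abs(lhs - rhs)


def generator_error(a, b, x, h=1e-4):
    """Compare a central difference of U(t)psi at t = 0 with i(aX + bP)psi, P = -i hbar d/dx."""
    numeric = (U(a, b, h, psi0)(x) - U(a, b, -h, psi0)(x)) / (2 * h)
    exact = 1j * (a * x * psi0(x) + b * (-1j * HBAR) * dpsi0(x))
    return abs(numeric - exact)
```

The group-law error should be at the level of rounding, while the generator error is limited by the finite-difference step. Unitarity, strong continuity, and essential self-adjointness are not tested by this sketch.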

With the computation of the operator $e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}$ in hand, we return to our analysis of the proposed formula (13.17) for the general Weyl quantization. If the Fourier transform of $f$ is in $L^1(\mathbb{R}^{2n})$, we can regard the right-hand side of (13.17) as an absolutely convergent "Bochner" integral with values in the Banach space $\mathcal{B}(\mathbf{H})$. For our purposes, however, it is more convenient to think of operators on $L^2(\mathbb{R}^n)$ as integral operators and to write down a formula for the integral kernel of $Q_{\mathrm{Weyl}}(f)$ in terms of $f$ itself. (But see Exercise 7.)

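At a formal level, the operator mapping $\psi$ to $e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}e^{i\mathbf{a}\cdot\mathbf{x}}\psi(\mathbf{x} + \hbar\mathbf{b})$ may be thought of as an "integral" operator, with integral kernel given by

\[ e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{i\mathbf{a}\cdot\mathbf{x}}\,\delta_n(\mathbf{x} + \hbar\mathbf{b} - \mathbf{y}), \tag{13.22} \]

where $\delta_n$ is an $n$-dimensional delta-function (the $n$-dimensional analog of the distribution in Example A.26). Thus, it should be possible to obtain the integral kernel of $Q_{\mathrm{Weyl}}(f)$ by integrating the preceding expression against $\hat f(\mathbf{a},\mathbf{b})$. To evaluate the resulting integral, we make the change of variable $\mathbf{c} = \hbar\mathbf{b}$, in which case we obtain

\begin{align}
&(2\pi\hbar)^{-n}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{i(\mathbf{a}\cdot\mathbf{c})/2}\,e^{i\mathbf{a}\cdot\mathbf{x}}\,\delta_n(\mathbf{x} + \mathbf{c} - \mathbf{y})\,\hat f(\mathbf{a},\mathbf{c}/\hbar)\,d\mathbf{c}\,d\mathbf{a} \notag\\
&\quad= (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} e^{i(\mathbf{a}\cdot(\mathbf{y}-\mathbf{x}))/2}\,e^{i\mathbf{a}\cdot\mathbf{x}}\,\hat f(\mathbf{a},(\mathbf{y}-\mathbf{x})/\hbar)\,d\mathbf{a} \notag\\
&\quad= \hbar^{-n}(2\pi)^{-n/2}\left[(2\pi)^{-n/2}\int_{\mathbb{R}^n} e^{i\mathbf{a}\cdot(\mathbf{x}+\mathbf{y})/2}\,\hat f(\mathbf{a},(\mathbf{y}-\mathbf{x})/\hbar)\,d\mathbf{a}\right]. \tag{13.23}
\end{align}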



We may recognize the integral in square brackets in the last line of (13.23) as undoing the Fourier transform of $f$ in the $\mathbf{x}$-variable, leaving us with the partial Fourier transform of $f$ in the $\mathbf{p}$-variable, evaluated at the point $((\mathbf{x} + \mathbf{y})/2, (\mathbf{y} - \mathbf{x})/\hbar)$. (The partial Fourier transform means the ordinary Fourier transform with respect to one of the variables, with the other variable fixed.) Thus, we expect that $Q_{\mathrm{Weyl}}(f)$ should be the integral operator with integral kernel $\kappa_f$ given by

\[ \kappa_f(\mathbf{x},\mathbf{y}) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} f((\mathbf{x} + \mathbf{y})/2, \mathbf{p})\,e^{-i(\mathbf{y}-\mathbf{x})\cdot\mathbf{p}/\hbar}\,d\mathbf{p}. \tag{13.24} \]

13.3.2 The $L^2$ Theory

With the preceding calculations as motivation, we now define $Q_{\mathrm{Weyl}}(f)$ to be the integral operator with kernel $\kappa_f$, beginning with the case in which $f$ belongs to $L^2(\mathbb{R}^{2n})$. The resulting operators will turn out to be Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$.

If $\mathbf{H}$ is a Hilbert space and $A \in \mathcal{B}(\mathbf{H})$ is a non-negative self-adjoint operator on $\mathbf{H}$, then it can be shown that $A$ has a well-defined (but possibly infinite) trace. What this means is that the value of

\[ \operatorname{trace}(A) := \sum_j \langle e_j, Ae_j\rangle \]

is the same for each orthonormal basis $\{e_j\}$ of $\mathbf{H}$. Note that since $A$ is a non-negative operator, $\langle e_j, Ae_j\rangle$ is a non-negative real number, so that the sum is always defined, but may have the value $+\infty$.

Now, if $A$ is any bounded operator, then $A^*A$ is self-adjoint and non-negative. We say that $A$ is Hilbert-Schmidt if

\[ \operatorname{trace}(A^*A) < \infty. \]

Given two Hilbert-Schmidt operators $A$ and $B$, it can be shown that $A^*B$ is a trace-class operator, meaning that the sum

\[ \operatorname{trace}(A^*B) := \sum_{j=1}^{\infty} \langle e_j, A^*Be_j\rangle \]

is absolutely convergent and the value of the sum is independent of the choice of orthonormal basis. We define the Hilbert-Schmidt inner product of $A$ and $B$ and the associated Hilbert-Schmidt norm of $A$ by

\[ \langle A, B\rangle_{HS} := \operatorname{trace}(A^*B), \qquad \|A\|_{HS} := \sqrt{\operatorname{trace}(A^*A)}. \]

It can be shown that the space of Hilbert-Schmidt operators on $\mathbf{H}$ forms a Hilbert space with respect to the Hilbert-Schmidt inner product. (See Sect. 19.2 for more details.) We denote the space of Hilbert-Schmidt operators on $\mathbf{H}$ by $HS(\mathbf{H})$.
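In finite dimensions every operator is Hilbert-Schmidt, and the definitions above reduce to familiar matrix facts. The following sketch, with hypothetical helper names and an arbitrarily chosen $2\times 2$ example, illustrates two of them: $\operatorname{trace}(A^*A)$ does not depend on the orthonormal basis and equals the sum of the squared absolute values of the matrix entries (a discrete counterpart of an $L^2$ norm of an integral kernel), and $\|AB\|_{HS} \le \|A\|_{HS}\|B\|_{HS}$.

```python
import math


def adjoint(A):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace_in_basis(A, basis):
    """Sum over j of <e_j, A e_j>, with the inner product conjugate-linear in the first slot."""
    total = 0j
    for e in basis:
        Ae = [sum(a * x for a, x in zip(row, e)) for row in A]
        total += sum(x.conjugate() * y for x, y in zip(e, Ae))
    return total


def hs_norm(A, basis):
    """Hilbert-Schmidt norm sqrt(trace(A* A)), with the trace computed in the given basis."""
    return math.sqrt(trace_in_basis(matmul(adjoint(A), A), basis).real)


def frobenius(A):
    """Square root of the sum of |a_ij|^2."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))


standard = [[1 + 0j, 0j], [0j, 1 + 0j]]
theta = 0.6
# A second orthonormal basis of C^2: (cos t, i sin t) and (i sin t, cos t).
rotated = [[math.cos(theta) + 0j, 1j * math.sin(theta)],
           [1j * math.sin(theta), math.cos(theta) + 0j]]

A = [[1 + 2j, -1j], [0.5 + 0j, 3 + 0j]]
B = [[0j, 2 - 1j], [1 + 1j, -0.25 + 0j]]
```

For these matrices, `hs_norm(A, standard)`, `hs_norm(A, rotated)`, and `frobenius(A)` agree up to rounding. The infinite-dimensional statements in the text require the convergence arguments referred to in Sect. 19.2.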

We will make use of the following standard (and elementary) result characterizing Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$ in terms of integral operators. (See, for example, Theorem VI.23 in Volume I of [34].)

Proposition 13.6 If $\kappa$ is in $L^2(\mathbb{R}^n \times \mathbb{R}^n)$, then for every $\psi \in L^2(\mathbb{R}^n)$, the integral

\[ A_\kappa(\psi)(\mathbf{x}) := \int_{\mathbb{R}^n} \kappa(\mathbf{x},\mathbf{y})\,\psi(\mathbf{y})\,d\mathbf{y} \tag{13.25} \]

is absolutely convergent for almost every $\mathbf{x} \in \mathbb{R}^n$, and $A_\kappa(\psi)$ also belongs to $L^2(\mathbb{R}^n)$. Furthermore, the operator $A_\kappa$ is a Hilbert-Schmidt operator on $L^2(\mathbb{R}^n)$ and

\[ \|A_\kappa\|_{HS} = \|\kappa\|_{L^2(\mathbb{R}^n\times\mathbb{R}^n)}. \]

Conversely, for any Hilbert-Schmidt operator $A$ on $L^2(\mathbb{R}^n)$, there exists a unique $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$ such that $A = A_\kappa$.

We are now ready, using the discussion in Sect. 13.3.1 as motivation, to define the Weyl quantization of $L^2$ symbols.

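Definition 13.7 For all $f \in L^2(\mathbb{R}^{2n})$, define $\kappa_f : \mathbb{R}^{2n} \to \mathbb{C}$ by

\[ \kappa_f(\mathbf{x},\mathbf{y}) = (2\pi\hbar)^{-n}\int_{\mathbb{R}^n} f((\mathbf{x} + \mathbf{y})/2, \mathbf{p})\,e^{-i(\mathbf{y}-\mathbf{x})\cdot\mathbf{p}/\hbar}\,d\mathbf{p}, \tag{13.26} \]

and define the Weyl quantization of $f$, as an operator on $L^2(\mathbb{R}^n)$, by

\[ Q_{\mathrm{Weyl}}(f) = A_{\kappa_f}, \]

where $A_{\kappa_f}$ is defined by (13.25).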

The integral in (13.26) is not necessarily absolutely convergent, and should be understood as computing a partial Fourier transform. Thus, we should, strictly speaking, replace the right-hand side of (13.26) with

\[ \lim_{R\to\infty}\,(2\pi\hbar)^{-n}\int_{|\mathbf{p}|\le R} f((\mathbf{x} + \mathbf{y})/2, \mathbf{p})\,e^{-i(\mathbf{y}-\mathbf{x})\cdot\mathbf{p}/\hbar}\,d\mathbf{p}, \tag{13.27} \]

where the limit is in the norm topology of $L^2(\mathbb{R}^{2n})$. [The partial Fourier transform maps the Schwartz space $\mathcal{S}(\mathbb{R}^{2n})$ to itself. By Fubini's theorem and the Plancherel formula for $\mathbb{R}^n$, the partial Fourier transform is an $L^2$-isometry and extends to a unitary map of $L^2(\mathbb{R}^{2n})$ to itself. This unitary map can be computed by the usual formula on functions in $L^1 \cap L^2$ and can be computed by a limiting formula similar to (13.27) in general.]

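In words, we may describe the procedure for computing $\kappa_f$ at a point $(\mathbf{x}^1, \mathbf{x}^2)$ in $\mathbb{R}^{2n}$ as follows. First, compute the partial Fourier transform $\mathcal{F}_p$ of $f(\mathbf{x},\mathbf{p})$ in the $\mathbf{p}$-variable, resulting in the function $(\mathcal{F}_pf)(\mathbf{x},\boldsymbol{\xi})$. Then evaluate $\mathcal{F}_pf$ at the point $\mathbf{x} = (\mathbf{x}^1 + \mathbf{x}^2)/2$, $\boldsymbol{\xi} = (\mathbf{x}^2 - \mathbf{x}^1)/\hbar$. Finally, multiply the result by $\hbar^{-n}(2\pi)^{-n/2}$ to get

\[ \kappa_f(\mathbf{x}^1,\mathbf{x}^2) = \hbar^{-n}(2\pi)^{-n/2}\,(\mathcal{F}_pf)\big((\mathbf{x}^1 + \mathbf{x}^2)/2,\ (\mathbf{x}^2 - \mathbf{x}^1)/\hbar\big). \tag{13.28} \]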
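The recipe for the kernel can be tried on a concrete symbol. The sketch below assumes $n = 1$, an arbitrary test value of $\hbar$, and the Gaussian symbol $f(x,p) = e^{-(x^2+p^2)/2}$ (a hypothetical example, not taken from the text). It approximates the integral in (13.26) by a midpoint rule on a truncated interval, in the spirit of (13.27), and compares it with the closed form obtained from $\int e^{-p^2/2}e^{-isp}\,dp = \sqrt{2\pi}\,e^{-s^2/2}$.

```python
import cmath
import math

HBAR = 0.5  # arbitrary positive test value (an assumption, not from the text)


def f(x, p):
    """Hypothetical test symbol: a real Gaussian on phase space (n = 1)."""
    return math.exp(-(x * x + p * p) / 2)


def kernel(x, y, cutoff=12.0, steps=4000):
    """Approximate kappa_f(x, y) from (13.26) by a midpoint rule on |p| <= cutoff."""
    dp = 2 * cutoff / steps
    mid = (x + y) / 2
    total = 0j
    for k in range(steps):
        p = -cutoff + (k + 0.5) * dp
        total += f(mid, p) * cmath.exp(-1j * (y - x) * p / HBAR)
    return total * dp / (2 * math.pi * HBAR)


def kernel_exact(x, y):
    """Closed form of kappa_f for the Gaussian symbol above."""
    s = (y - x) / HBAR
    mid = (x + y) / 2
    return (math.exp(-mid * mid / 2) * math.sqrt(2 * math.pi)
            * math.exp(-s * s / 2) / (2 * math.pi * HBAR))
```

Because this symbol is real valued, the computed kernel should also satisfy $\kappa_f(x,y) = \overline{\kappa_f(y,x)}$, which is the kernel-level statement of self-adjointness.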

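Theorem 13.8 The map $Q_{\mathrm{Weyl}}$ is a constant multiple of a unitary map of $L^2(\mathbb{R}^{2n})$ onto $HS(L^2(\mathbb{R}^n))$. The inverse map $Q_{\mathrm{Weyl}}^{-1} : HS(L^2(\mathbb{R}^n)) \to L^2(\mathbb{R}^{2n})$ is given by

\[ Q_{\mathrm{Weyl}}^{-1}(A)(\mathbf{x},\mathbf{p}) = \hbar^n\int_{\mathbb{R}^n} \kappa(\mathbf{x} - \hbar\mathbf{b}/2,\ \mathbf{x} + \hbar\mathbf{b}/2)\,e^{i\mathbf{b}\cdot\mathbf{p}}\,d\mathbf{b}, \]

where $\kappa$ is the integral kernel of $A$ as in Proposition 13.6.

Furthermore, for all $f \in L^2(\mathbb{R}^{2n})$, we have $Q_{\mathrm{Weyl}}(\bar f) = Q_{\mathrm{Weyl}}(f)^*$; in particular, $Q_{\mathrm{Weyl}}(f)$ is self-adjoint if $f$ is real valued.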

Properly speaking, the integral in the theorem should be understood as an $L^2$ limit, as in (13.27). The fact that $Q_{\mathrm{Weyl}}$ is unitary (up to a constant) tells us that for an appropriate constant $c$, the operators $c\,e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}$ form an "orthonormal basis in the continuous sense" for the Hilbert space $HS(L^2(\mathbb{R}^n))$. (Compare Sect. 6.6.)

It is possible, using the same formulas, to extend the notion of Weyl quantization to symbols belonging to the space of tempered distributions, that is, the space of continuous linear functionals on $\mathcal{S}(\mathbb{R}^{2n})$. We will not, however, develop this construction here. See [11] for more information.

Proof. Proposition 13.6 gives a unitary identification of $HS(L^2(\mathbb{R}^n))$ with $L^2(\mathbb{R}^n \times \mathbb{R}^n)$. Thus, it suffices to show that the map $f \mapsto \kappa_f$ is a multiple of a unitary map. This result holds because the partial Fourier transform is a unitary map of $L^2(\mathbb{R}^{2n})$ to itself and composition with an invertible linear map is a constant multiple of a unitary map. The inverse of the map $f \mapsto \kappa_f$ is obtained by inverting the linear map and undoing the partial Fourier transform. Finally, it is apparent from (13.26) that

\[ \kappa_{\bar f}(\mathbf{x},\mathbf{y}) = \overline{\kappa_f(\mathbf{y},\mathbf{x})}. \]

This, along with Exercise 6, shows that $Q_{\mathrm{Weyl}}(\bar f) = Q_{\mathrm{Weyl}}(f)^*$. ■

13.3.3 The Composition Formula

If $f$ and $g$ are $L^2$ functions on $\mathbb{R}^{2n}$, then $Q_{\mathrm{Weyl}}(f)$ and $Q_{\mathrm{Weyl}}(g)$ are Hilbert-Schmidt operators, in which case their product is again Hilbert-Schmidt. (Indeed, the product of a Hilbert-Schmidt operator and a bounded operator is always Hilbert-Schmidt.) Thus, since $Q_{\mathrm{Weyl}}$ is a bijection of $L^2(\mathbb{R}^{2n})$ with $HS(L^2(\mathbb{R}^n))$, there is a unique $L^2$ function, which we denote by $f \star g$, such that

\[ Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) = Q_{\mathrm{Weyl}}(f \star g). \tag{13.29} \]

(Of course, the operator $\star$, like the Weyl quantization itself, depends on $\hbar$, but we suppress this dependence in the notation.)

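Proposition 13.9 The Moyal product $f \star g$ may be characterized in terms of the Fourier transform as

\[ \widehat{(f \star g)}(\mathbf{a},\mathbf{b}) = (2\pi)^{-n}\iint e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\,\hat f(\mathbf{a} - \mathbf{a}', \mathbf{b} - \mathbf{b}')\,\hat g(\mathbf{a}',\mathbf{b}')\,d\mathbf{a}'\,d\mathbf{b}', \]

where both integrals are over $\mathbb{R}^n$.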

Note that if we set $\hbar = 0$ in the above formula, $\widehat{f \star g}$ reduces to $(2\pi)^{-n}$ times the convolution of $\hat f$ and $\hat g$, which is nothing but the Fourier transform of $fg$. It is thus not difficult to show (Exercise 10) that

\[ \lim_{\hbar\to 0^+} f \star g = fg. \]

That is to say, the Moyal product $f \star g$ is a "deformation" of the ordinary pointwise product of functions on $\mathbb{R}^{2n}$. More generally, the Moyal product can be expanded in an asymptotic expansion in powers of $\hbar$, as explained in Sect. 2.3 of [11]. This expansion terminates in the case that $f$ and $g$ are both polynomials.
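As a small illustration of the deformation picture, one can read off the Moyal product of the coordinate functions from Proposition 13.4 in the case $n = 1$. This is only a formal sketch: $x$ and $p$ are not $L^2$ symbols, so the assumption here is that (13.29) is taken as the defining property of $\star$ on polynomials as well.

```latex
% Take g = p in (13.7) and g = x in (13.9); in both cases the derivative is 1.
\begin{align*}
Q_{\mathrm{Weyl}}(x)\,Q_{\mathrm{Weyl}}(p) &= Q_{\mathrm{Weyl}}(xp) + \frac{i\hbar}{2}\,I
  &&\Longrightarrow\quad x \star p = xp + \frac{i\hbar}{2},\\
Q_{\mathrm{Weyl}}(p)\,Q_{\mathrm{Weyl}}(x) &= Q_{\mathrm{Weyl}}(xp) - \frac{i\hbar}{2}\,I
  &&\Longrightarrow\quad p \star x = xp - \frac{i\hbar}{2},\\
x \star p - p \star x &= i\hbar = i\hbar\,\{x, p\}.
\end{align*}
```

Both products reduce to the pointwise product $xp$ as $\hbar \to 0$, and the expansion in powers of $\hbar$ stops after the first-order term, in line with the remark about polynomial symbols.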

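Proof. It is, of course, possible to obtain this formula using kernel functions. It is, however, easier to work with the formula (13.17), which can be shown (Exercise 7) to give the same result as Definition 13.7 when $f$ is a Schwartz function. We assume standard properties of the Bochner integral for functions with values in a Banach space [in our case, $\mathcal{B}(L^2(\mathbb{R}^n))$], which are similar to those of the Lebesgue integral. (See, for example, Sect. V.5 of [46].)

We have, then,

\[ Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) = (2\pi)^{-2n}\int_{\mathbb{R}^{2n}}\int_{\mathbb{R}^{2n}} \hat f(\mathbf{a},\mathbf{b})\,\hat g(\mathbf{a}',\mathbf{b}')\,e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}\,e^{i(\mathbf{a}'\cdot\mathbf{X} + \mathbf{b}'\cdot\mathbf{P})}\,d\mathbf{a}\,d\mathbf{b}\,d\mathbf{a}'\,d\mathbf{b}'. \tag{13.30} \]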


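Now, it is an easy calculation to verify, using Proposition 13.5, that

\[ e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}\,e^{i(\mathbf{a}'\cdot\mathbf{X} + \mathbf{b}'\cdot\mathbf{P})} = e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\,e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{X} + (\mathbf{b}+\mathbf{b}')\cdot\mathbf{P})}, \tag{13.31} \]

which is what one obtains by formally applying the special case of the Baker-Campbell-Hausdorff formula in (13.18). Thus, we may combine the integrals in (13.30) to obtain

\begin{align*}
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) = (2\pi)^{-2n}\iint\!\!\iint & e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\,e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{X} + (\mathbf{b}+\mathbf{b}')\cdot\mathbf{P})}\\
&\times \hat f(\mathbf{a},\mathbf{b})\,\hat g(\mathbf{a}',\mathbf{b}')\,d\mathbf{a}\,d\mathbf{b}\,d\mathbf{a}'\,d\mathbf{b}'.
\end{align*}

By introducing new variables $\mathbf{c} = \mathbf{a} + \mathbf{a}'$ and $\mathbf{d} = \mathbf{b} + \mathbf{b}'$ in the $\mathbf{a}$ and $\mathbf{b}$ integrals and reversing the order of integration, we obtain, after simplifying the exponent,

\begin{align*}
Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g) = (2\pi)^{-n}\iint \Big[ & (2\pi)^{-n}\iint e^{-i\hbar(\mathbf{c}\cdot\mathbf{b}' - \mathbf{d}\cdot\mathbf{a}')/2}\\
&\times \hat f(\mathbf{c} - \mathbf{a}', \mathbf{d} - \mathbf{b}')\,\hat g(\mathbf{a}',\mathbf{b}')\,d\mathbf{a}'\,d\mathbf{b}'\Big]\,e^{i(\mathbf{c}\cdot\mathbf{X} + \mathbf{d}\cdot\mathbf{P})}\,d\mathbf{c}\,d\mathbf{d}.
\end{align*}

From this and (13.17), we see that $Q_{\mathrm{Weyl}}(f)Q_{\mathrm{Weyl}}(g)$ is the Weyl quantization of the function whose Fourier transform is the quantity in square brackets above, which is what we wanted to show. ■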
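The "easy calculation" behind (13.31) can be sketched by applying (13.20) twice to a function $\psi$ and comparing phases. The following is only an outline of that bookkeeping, written under the assumption that (13.20) is used exactly as stated.

```latex
\begin{align*}
\big(e^{i(\mathbf{a}\cdot\mathbf{X}+\mathbf{b}\cdot\mathbf{P})}
     e^{i(\mathbf{a}'\cdot\mathbf{X}+\mathbf{b}'\cdot\mathbf{P})}\psi\big)(\mathbf{x})
&= e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\,e^{i\mathbf{a}\cdot\mathbf{x}}\;
   e^{i\hbar(\mathbf{a}'\cdot\mathbf{b}')/2}\,e^{i\mathbf{a}'\cdot(\mathbf{x}+\hbar\mathbf{b})}\,
   \psi(\mathbf{x}+\hbar\mathbf{b}+\hbar\mathbf{b}')\\
&= e^{i\hbar(\mathbf{a}\cdot\mathbf{b}+2\,\mathbf{a}'\cdot\mathbf{b}+\mathbf{a}'\cdot\mathbf{b}')/2}\,
   e^{i(\mathbf{a}+\mathbf{a}')\cdot\mathbf{x}}\,
   \psi(\mathbf{x}+\hbar(\mathbf{b}+\mathbf{b}')),\\[4pt]
\big(e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{X}+(\mathbf{b}+\mathbf{b}')\cdot\mathbf{P})}\psi\big)(\mathbf{x})
&= e^{i\hbar(\mathbf{a}+\mathbf{a}')\cdot(\mathbf{b}+\mathbf{b}')/2}\,
   e^{i(\mathbf{a}+\mathbf{a}')\cdot\mathbf{x}}\,
   \psi(\mathbf{x}+\hbar(\mathbf{b}+\mathbf{b}')).
\end{align*}
% The ratio of the two phases is
% exp( i\hbar [ a.b + 2 a'.b + a'.b' - (a+a').(b+b') ] / 2 )
%   = exp( i\hbar ( a'.b - a.b' ) / 2 )
%   = exp( -i\hbar ( a.b' - b.a' ) / 2 ),
% which is the scalar factor appearing in (13.31).
```

The only input is (13.20); the skew-symmetric combination $\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}'$ appears because the symmetric part of the phase is absorbed into the exponential on the right-hand side.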


Proposition 13.10 The Moyal product $f \star g$ extends to a continuous map of $L^2(\mathbb{R}^{2n}) \times L^2(\mathbb{R}^{2n})$ into $L^2(\mathbb{R}^{2n})$ and the composition formula (13.29) holds for all $f$ and $g$ in $L^2(\mathbb{R}^{2n})$.

Proof. A standard inequality asserts that for any two Hilbert-Schmidt operators $A$ and $B$, we have

\[ \|AB\|_{HS} \le \|A\|_{HS}\,\|B\|_{HS}. \]

It follows that the product map $(A, B) \mapsto AB$ is a continuous map of $HS(L^2(\mathbb{R}^n)) \times HS(L^2(\mathbb{R}^n))$ to $HS(L^2(\mathbb{R}^n))$. Meanwhile, the Weyl quantization is a constant multiple of a unitary map from $L^2(\mathbb{R}^{2n})$ to $HS(L^2(\mathbb{R}^n))$. For Schwartz functions $f$ and $g$, the Moyal product is nothing but

\[ f \star g = Q_{\mathrm{Weyl}}^{-1}\big(Q_{\mathrm{Weyl}}(f)\,Q_{\mathrm{Weyl}}(g)\big). \tag{13.32} \]

The right-hand side of (13.32) provides the desired continuous extension of $f \star g$. Clearly, the composition formula (13.29) holds for this extension. ■


13.3.4  Commutation  Relations 

In quantum mechanics, the commutator of two operators (divided by $i\hbar$) plays a role similar to that of the Poisson bracket in classical mechanics. Thus, we may naturally ask: To what extent does the Weyl quantization (or any other quantization scheme) map Poisson brackets to commutators? The short answer is: Not always. Indeed, as we will see in Sect. 13.4, no "reasonable" quantization scheme can give an exact correspondence between $\{f, g\}$ on the classical side and $[A, B]/(i\hbar)$ on the quantum side. Nevertheless, such an exact correspondence does hold for various special classes of symbols. If we consider, for example, the class of symbols that depend only on $\mathbf{x}$ and not on $\mathbf{p}$, then on the classical side, all such functions Poisson commute. The Weyl quantization maps such functions $f(\mathbf{x})$ to the operator of multiplication by $f(\mathbf{x})$, and thus the quantizations of any two such functions commute. A more interesting (in particular, noncommutative) example is as follows.

Proposition 13.11 Suppose $f$ is a polynomial in $\mathbf{x}$ and $\mathbf{p}$ of degree at most 2 and $g$ is an arbitrary polynomial in $\mathbf{x}$ and $\mathbf{p}$. Then

\[ \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.33} \]

where $\{f, g\}$ is the Poisson bracket of $f$ and $g$.

Here, we define the Weyl quantization by the obvious $n$-variable extension of Definition 13.1, and we regard all operators as operating simply on $C_c^\infty(\mathbb{R}^n)$. See Exercise 8 for another class of symbols on which (13.33) holds. Although the requirement that $g$ be a polynomial can be relaxed, we will not attempt to obtain the optimal version of the result.

Proof. For notational simplicity, we abbreviate $Q_{\mathrm{Weyl}}(f)$ to $Q(f)$ for the duration of the proof. If $f$ has degree zero, then both sides of the desired equality are zero. Turning to the case in which $f$ has degree 1, we use the $n$-variable extension of Proposition 13.4, the proof of which is essentially the same as the 1-variable result. The result is as follows:

\begin{align*}
Q(x_jg) &= Q(x_j)Q(g) - \frac{i\hbar}{2}\,Q\!\left(\frac{\partial g}{\partial p_j}\right)\\
&= Q(g)Q(x_j) + \frac{i\hbar}{2}\,Q\!\left(\frac{\partial g}{\partial p_j}\right).
\end{align*}

By subtracting these two formulas and rearranging, we get

\[ \frac{1}{i\hbar}\,[Q(x_j), Q(g)] = Q\!\left(\frac{\partial g}{\partial p_j}\right) = Q(\{x_j, g\}). \]

A very similar argument establishes the desired result when $f = p_j$ and thus for all homogeneous polynomials of degree 1.

Suppose now that $f_1$ and $f_2$ are homogeneous polynomials of degree 1 in $\mathbf{x}$ and $\mathbf{p}$. Then it follows easily from Proposition 13.4 that for any polynomial $h$, we have

\[ Q(f_jh) = \tfrac{1}{2}\big(Q(f_j)Q(h) + Q(h)Q(f_j)\big), \quad j = 1, 2. \tag{13.34} \]

In particular, we have

\[ Q(f_1f_2) = \tfrac{1}{2}\big(Q(f_1)Q(f_2) + Q(f_2)Q(f_1)\big). \tag{13.35} \]

Using (13.35) and the product rule for commutators (Proposition 3.15), we have

\begin{align*}
\frac{1}{i\hbar}\,[Q(f_1f_2), Q(g)]
= \frac{1}{2i\hbar}\big( & [Q(f_1), Q(g)]Q(f_2) + Q(f_1)[Q(f_2), Q(g)]\\
& + [Q(f_2), Q(g)]Q(f_1) + Q(f_2)[Q(f_1), Q(g)]\big).
\end{align*}




Using the degree-1 case of the result we are trying to prove, along with (13.34), we get

\begin{align}
\frac{1}{i\hbar}\,[Q(f_1f_2), Q(g)]
&= \tfrac{1}{2}\big(Q(\{f_1,g\})Q(f_2) + Q(f_1)Q(\{f_2,g\}) \notag\\
&\qquad + Q(\{f_2,g\})Q(f_1) + Q(f_2)Q(\{f_1,g\})\big) \notag\\
&= Q(\{f_1,g\}f_2) + Q(f_1\{f_2,g\}) \notag\\
&= Q(\{f_1f_2, g\}), \tag{13.36}
\end{align}

where in the last equality we have used the product rule for the Poisson bracket. We have now established the desired result when $f$ is a homogeneous polynomial of degree 0, 1, or 2. ■

At first glance, it appears that one could extend the result to the case where $f$ has degree 3, by considering three homogeneous polynomials $f_1$, $f_2$, and $f_3$ of degree 1 and symmetrizing as in (13.35). The argument breaks down, however, because the $Q(f_j)$'s do not commute. The $Q(f_j)$'s will not always occur in the correct order to allow us to pull the $f_j$'s back inside the Weyl quantization, the way we did in (13.36) in the degree-2 case. Indeed, an elementary but tedious calculation shows that

\[ \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(x^2p), Q_{\mathrm{Weyl}}(xp^2)] = 3X^2P^2 - 6i\hbar XP - \hbar^2I, \]

whereas

\[ Q_{\mathrm{Weyl}}(\{x^2p, xp^2\}) = 3X^2P^2 - 6i\hbar XP - \tfrac{3}{2}\hbar^2I, \]

so that the two expressions differ by $\hbar^2I/2$.
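The "elementary but tedious" calculation can be delegated to a short script. The sketch below assumes $n = 1$, $\hbar = 1$, and that the Weyl quantization of a monomial is the average over all orderings of its factors; the function names are hypothetical. It applies both sides to polynomials and returns their difference, using $\{x^2p, xp^2\} = 3x^2p^2$.

```python
from itertools import permutations

HBAR = 1.0  # assumption: units with hbar = 1


def X(poly):
    """Multiply a polynomial (coefficient list, lowest degree first) by x."""
    return [0j] + list(poly)


def P(poly):
    """Apply P = -i*hbar*d/dx to a polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0j]


def weyl(a, b, poly):
    """Apply Q_Weyl(x^a p^b) to poly: average over all orderings of a X's and b P's."""
    words = list(permutations([X] * a + [P] * b))
    total = [0j] * (len(poly) + a)
    for word in words:
        out = list(poly)
        for op in reversed(word):  # the rightmost factor acts first
            out = op(out)
        for k, c in enumerate(out):
            total[k] += c
    return [c / len(words) for c in total]


def defect(poly):
    """(1/(i hbar)) [Q(x^2 p), Q(x p^2)] psi  minus  Q({x^2 p, x p^2}) psi."""
    ab = weyl(2, 1, weyl(1, 2, poly))        # Q(x^2 p) Q(x p^2) psi
    ba = weyl(1, 2, weyl(2, 1, poly))        # Q(x p^2) Q(x^2 p) psi
    bracket = weyl(2, 2, poly) + [0j]        # Q(x^2 p^2) psi, padded to the common length
    return [(s - t) / (1j * HBAR) - 3 * q for s, t, q in zip(ab, ba, bracket)]
```

For every test polynomial, `defect(psi)` should equal $(\hbar^2/2)\psi$ up to rounding, which matches the stated discrepancy of $\hbar^2I/2$ between the two operators.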

We conclude this section with a brief glimpse of an important "equivariance" property of the Weyl quantization. Note that the Poisson bracket of two real valued homogeneous polynomials of degree 2 is again real valued and homogeneous of degree 2. The space of real homogeneous polynomials of degree 2 thus forms a Lie algebra (Sect. 16.3) with respect to the Poisson bracket. This Lie algebra is naturally isomorphic to the Lie algebra $\mathfrak{sp}(n;\mathbb{R})$ of the Lie group $Sp(n;\mathbb{R})$, the real symplectic group. This group is the group of invertible linear transformations that preserve a skew-symmetric form on $\mathbb{R}^{2n}$. See Chap. 16 for information about Lie groups and their Lie algebras.

If we apply Proposition 13.11 in the case in which both $f$ and $g$ are homogeneous of degree 2, we see that the map $\pi(f) := \frac{1}{i\hbar}Q_{\mathrm{Weyl}}(f)$ is a representation of $\mathfrak{sp}(n;\mathbb{R})$ in the space of skew-symmetric operators on $L^2(\mathbb{R}^n)$. It can be shown that associated to this representation of $\mathfrak{sp}(n;\mathbb{R})$ there is a projective unitary representation $\Pi$ of the group $Sp(n;\mathbb{R})$, known as the metaplectic representation. (See, again, Chap. 16 for definitions.) Proposition 13.11 is the infinitesimal version of the following equivariance property of the Weyl quantization: For all $A \in Sp(n;\mathbb{R})$ and all $f \in L^2(\mathbb{R}^{2n})$, we have

\[ Q_{\mathrm{Weyl}}(f \circ A^{-1}) = \Pi(A)\,Q_{\mathrm{Weyl}}(f)\,\Pi(A)^{-1}. \]

See Theorem 2.15 and Chap. 4 of [11] [where our $\Pi(A)$ corresponds to $\mu((A^*)^{-1})$ in Folland's notation] for this result and much more about the metaplectic representation.


13.4  The  “No  Go”  Theorem  of  Groenewold 

In Sect. 13.3.4, we noted that the Weyl quantization on polynomials satisfies

\[ \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.37} \]

provided that $f$ is a polynomial of degree at most 2, but not in general. One might think that the failure of (13.37) represents a shortcoming in the definition of the Weyl quantization, which could be remedied by an alternative definition. In this section, however, we will see that no quantization scheme that maps $x_j$ and $p_j$ to the usual position and momentum operators $X_j$ and $P_j$ can satisfy (13.37) for general polynomials in $\mathbf{x}$ and $\mathbf{p}$. This sort of nonexistence result, of a construct satisfying seemingly natural and desirable conditions, is referred to in the physics literature as a "no go" theorem.

In light of this result, one might think that perhaps the position and momentum operators should be defined differently, possibly with an accompanying change in the choice of the quantum Hilbert space. Indeed, there is a map $Q$ that satisfies (13.37) for all $f$ and $g$, namely the prequantization map described in Sect. 23.3. The prequantization map accomplishes this feat by drastically enlarging the quantum Hilbert space, from $L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^{2n})$. The Hilbert space $L^2(\mathbb{R}^{2n})$ is considered to be "too big" from a physical standpoint, which explains why the map $Q$ is only "prequantization" rather than "quantization." (The prequantization map has a number of other undesirable features that are described in Sect. 23.3.) If one imposes a natural "smallness" assumption on the quantum Hilbert space (irreducibility under the action of the position and momentum operators), then the Stone-von Neumann theorem will tell us that (modulo certain technical domain assumptions) any choice of position and momentum operators satisfying the canonical commutation relations is unitarily equivalent to the usual ones.

The upshot of the discussion in the two preceding paragraphs is that there is no physically reasonable quantization scheme that satisfies (13.37) for all (polynomial) functions $f$ and $g$.

We turn, now, to Groenewold's "no go" theorem. We need to make domain assumptions, so that it makes sense to compute the commutators of the quantized operators. The simplest approach is to assume that the quantization $Q(f)$ of any polynomial $f$ will be in the algebra generated by the $X_j$'s and $P_j$'s, and thus that $Q(f)$ will be a differential operator with polynomial coefficients. There is a variant of this result, known as van Hove's theorem, that proves a similar "no go" result under a more general assumption about the form of the quantized operators. See [15] for a rigorous proof of van Hove's theorem.

Definition 13.12 For any $k \ge 0$, let $\mathcal{P}_k$ denote the space of homogeneous polynomials of degree $k$ and let $\mathcal{P}_{\le k}$ denote the space of all polynomials of degree at most $k$.

Theorem 13.13 (Groenewold's Theorem) Let $\mathcal{D}(\mathbb{R}^n)$ denote the space of differential operators on $\mathbb{R}^n$ with polynomial coefficients. There does not exist a linear map $Q : \mathcal{P}_{\le 4} \to \mathcal{D}(\mathbb{R}^n)$ with the following properties.

1. $Q(1) = I$.

2. $Q(x_j) = X_j$ and $Q(p_j) = P_j$.

3. For all $f$ and $g$ in $\mathcal{P}_{\le 3}$, we have

\[ Q(\{f, g\}) = \frac{1}{i\hbar}\,[Q(f), Q(g)]. \tag{13.38} \]

Note that in Property 3 of the theorem, we assume that $f$ and $g$ belong to $\mathcal{P}_{\le 3}$ rather than $\mathcal{P}_{\le 4}$. This assumption guarantees that $\{f, g\}$ belongs to $\mathcal{P}_{\le 4}$, so that the left-hand side of (13.38) is defined.

Our strategy in proving Groenewold's theorem is the following. We know (Proposition 13.11) that the Weyl quantization satisfies (13.38) if $f$ has degree at most 2 and $g$ has degree at most 3. Using this result, we can show that any map $Q$ satisfying the properties in Theorem 13.13 must coincide with the Weyl quantization on $\mathcal{P}_{\le 3}$. We then identify a polynomial $f \in \mathcal{P}_4$ that can be expressed as a Poisson bracket in two different ways, $f = \{g, h\} = \{g', h'\}$, with $g$, $h$, $g'$, and $h'$ in $\mathcal{P}_3$. Upon calculating that $[Q_{\mathrm{Weyl}}(g), Q_{\mathrm{Weyl}}(h)]$ does not coincide with $[Q_{\mathrm{Weyl}}(g'), Q_{\mathrm{Weyl}}(h')]$, we will have a contradiction.

The proof will consist of several lemmas, followed by the coup de grâce.

Lemma 13.14 Consider an element $A$ of $\mathcal{D}(\mathbb{R}^n)$ expressed as

\[ A = \sum_{\mathbf{k}} f_{\mathbf{k}}(\mathbf{x})\left(\frac{\partial}{\partial\mathbf{x}}\right)^{\mathbf{k}}, \]

where $\mathbf{k}$ ranges over multi-indices, where the $f_{\mathbf{k}}$'s are polynomials, and where only finitely many of the $f_{\mathbf{k}}$'s are nonzero. Then $A$ is the zero operator on $C_c^\infty(\mathbb{R}^n)$ only if each of the $f_{\mathbf{k}}$'s is zero.

Proof. For each multi-index $\mathbf{k}$, let $|\mathbf{k}| = k_1 + \cdots + k_n$. Suppose not all the $f_{\mathbf{k}}$'s are zero, let $N$ be the smallest non-negative integer for which $f_{\mathbf{k}}$ is nonzero for some $\mathbf{k}$ with $|\mathbf{k}| = N$, and let $\mathbf{k}_0$ be some multi-index with $|\mathbf{k}_0| = N$ and $f_{\mathbf{k}_0} \ne 0$. Let us apply $A$ to a function $g$ that is equal, in a neighborhood of the origin, to $\mathbf{x}^{\mathbf{k}_0}$. Then all the terms in $Ag$ other than the $f_{\mathbf{k}_0}$ term will be zero in a neighborhood of the origin, whereas the $f_{\mathbf{k}_0}$ term will be a nonzero constant multiple of $f_{\mathbf{k}_0}$ in a neighborhood of the origin, and a nonzero polynomial cannot vanish identically on an open set. Thus, $A$ is not the zero operator. ■


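Lemma 13.15 If $A$ belongs to $\mathcal{D}(\mathbb{R}^n)$ and $A$ commutes with $X_j$ and $P_j$ for all $j = 1, \ldots, n$, then $A = cI$ for some $c \in \mathbb{C}$.

Proof. We may easily prove by induction that

\[ \left(\frac{\partial}{\partial x_j}\right)^{k}\big(x_j\,g(\mathbf{x})\big) = k\left(\frac{\partial}{\partial x_j}\right)^{k-1} g(\mathbf{x}) + x_j\left(\frac{\partial}{\partial x_j}\right)^{k} g(\mathbf{x}) \]

for any polynomial $g$. Thus, for any multi-index $\mathbf{k}$, we have

\[ \left[f(\mathbf{x})\left(\frac{\partial}{\partial\mathbf{x}}\right)^{\mathbf{k}},\ X_j\right] = k_j\,f(\mathbf{x})\left(\frac{\partial}{\partial\mathbf{x}}\right)^{\mathbf{k}-\mathbf{e}_j}, \tag{13.39} \]

where $\mathbf{e}_j$ is the multi-index with 1 in the $j$th entry and 0 elsewhere.

Suppose $A$ is a nonzero element of $\mathcal{D}(\mathbb{R}^n)$ that commutes with each $X_j$. If $\deg(A) = M$, consider a nonzero term in $A$ of degree $M$:

\[ f_{\mathbf{k}_0}(\mathbf{x})\left(\frac{\partial}{\partial\mathbf{x}}\right)^{\mathbf{k}_0}, \quad |\mathbf{k}_0| = M,\ f_{\mathbf{k}_0} \ne 0. \]

If $M > 0$, we can pick some $j$ such that the $j$th entry of $\mathbf{k}_0$ is nonzero. By (13.39) and our assumption on $A$, we have

\[ 0 = [A, X_j] = (\mathbf{k}_0)_j\,f_{\mathbf{k}_0}(\mathbf{x})\left(\frac{\partial}{\partial\mathbf{x}}\right)^{\mathbf{k}_0-\mathbf{e}_j} + \text{other terms}, \]

where the other terms involve multi-indices of the form $\mathbf{k} - \mathbf{e}_j$, with $\mathbf{k} \ne \mathbf{k}_0$. Thus, by Lemma 13.14, $[A, X_j]$ is not the zero operator, which is a contradiction.

We see, then, that any $A \in \mathcal{D}(\mathbb{R}^n)$ that commutes with each $X_j$ must be of degree zero; that is, $A$ must simply be multiplication by some polynomial $f(\mathbf{x})$. If, in addition, $A$ commutes with each $P_j$, then

\[ 0 = [f(\mathbf{x}), P_j] = i\hbar\,\frac{\partial f}{\partial x_j}. \]

Thus, actually, $f$ must be constant and $A$ is a multiple of the identity operator. ■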


Lemma 13.16 For any $f \in \mathcal{P}_2$, there exist $g_1, \ldots, g_j$ and $h_1, \ldots, h_j$ in $\mathcal{P}_2$ such that

\[ f = \{g_1, h_1\} + \cdots + \{g_j, h_j\}. \]

Furthermore, for any $f \in \mathcal{P}_3$, there exist elements $g_1', \ldots, g_k'$ of $\mathcal{P}_3$ and $h_1', \ldots, h_k'$ of $\mathcal{P}_2$ such that

\[ f = \{g_1', h_1'\} + \cdots + \{g_k', h_k'\}. \]

Proof. See Exercise 12. ■

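Lemma 13.17 If $Q$ satisfies the conditions in Theorem 13.13, then $Q$ coincides with $Q_{\mathrm{Weyl}}$ on $\mathcal{P}_{\le 3}$.

Proof. Our argument leans heavily on Proposition 13.11. Note that, by assumption, $Q$ coincides with $Q_{\mathrm{Weyl}}$ on $\mathcal{P}_{\le 1}$. For $f \in \mathcal{P}_2$, let us write $Q(f)$ as

\[ Q(f) = Q_{\mathrm{Weyl}}(f) + A_f. \]

For any $g \in \mathcal{P}_{\le 1}$, we have, by (13.38) and Proposition 13.11,

\begin{align}
Q(\{f, g\}) &= \frac{1}{i\hbar}\,[Q(f), Q(g)] \notag\\
&= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)] \notag\\
&= Q_{\mathrm{Weyl}}(\{f, g\}) + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)] \notag\\
&= Q(\{f, g\}) + \frac{1}{i\hbar}\,[A_f, Q_{\mathrm{Weyl}}(g)], \tag{13.40}
\end{align}

since $\{f, g\} \in \mathcal{P}_{\le 1}$. Thus, $[A_f, Q_{\mathrm{Weyl}}(g)] = 0$ for every $g \in \mathcal{P}_1$, and so, by Lemma 13.15, we must have $A_f = c_fI$ for some constant $c_f$.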

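Now, if $h$ is in $\mathcal{P}_2$, we have, by the just-established result and Proposition 13.11,

\begin{align}
Q(\{f, h\}) &= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f) + c_fI,\ Q_{\mathrm{Weyl}}(h) + c_hI] \notag\\
&= \frac{1}{i\hbar}\,[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(h)] \notag\\
&= Q_{\mathrm{Weyl}}(\{f, h\}). \tag{13.41}
\end{align}

That is to say, $Q$ and $Q_{\mathrm{Weyl}}$ agree on elements of $\mathcal{P}_2$ of the form $\{f, h\}$, for $f, h \in \mathcal{P}_2$. Thus, by Lemma 13.16, $Q$ and $Q_{\mathrm{Weyl}}$ agree on all of $\mathcal{P}_2$, and so on all of $\mathcal{P}_{\le 2}$.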

We now use the $\mathcal{P}_{\le 2}$ case of the lemma to establish the $\mathcal{P}_3$ case. Given $f \in \mathcal{P}_3$, we write $Q(f) = Q_{\mathrm{Weyl}}(f) + B_f$. Given $g \in \mathcal{P}_{\le 1}$, we have $\{f, g\} \in \mathcal{P}_{\le 2}$. Thus, we may argue as in (13.40), applying the just-established $\mathcal{P}_{\le 2}$ case of the lemma to $\{f, g\}$ in the last step. The conclusion is that $[B_f, Q(g)] = 0$ for all $g \in \mathcal{P}_{\le 1}$ and thus, by Lemma 13.15, that $B_f = d_f I$ for some constant $d_f$. Meanwhile, if $h \in \mathcal{P}_2$, we argue as in (13.41), but with $c_f$ replaced by $d_f$ and with $c_h$ now known to be zero. The conclusion is that $Q$ agrees with $Q_{\mathrm{Weyl}}$ for all elements of $\mathcal{P}_3$ of the form $\{f, h\}$ with $f \in \mathcal{P}_3$ and $h \in \mathcal{P}_2$, and thus, by Lemma 13.16, for all elements of $\mathcal{P}_3$. ■
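Proof of Theorem 13.13. Assume, toward a contradiction, that a map $Q$ as in the theorem exists. Let $f$ be the polynomial given by

\[
f(\mathbf{x}, \mathbf{p}) = x_1^2 p_1^2.
\]

We observe that $f$ can be written in two different ways as a Poisson bracket:

\[
x_1^2 p_1^2 = \frac{1}{9}\{x_1^3, p_1^3\} = \frac{1}{3}\{x_1^2 p_1, x_1 p_1^2\}.
\]

Thus, by Lemma 13.17, we must have

\[
\frac{1}{9}[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)] = \frac{1}{3}[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)].
\]

On the other hand, if we apply both commutators to the constant function $\mathbf{1}$ (or to a function equal to 1 in a neighborhood of the origin), we obtain

\[
\frac{1}{9}[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)]\mathbf{1} = \frac{1}{9}(X_1^3 P_1^3 - P_1^3 X_1^3)\mathbf{1} = -\frac{6}{9}(-i\hbar)^3 \cdot \mathbf{1}.
\]

Meanwhile, if we compute the quantizations as in (13.4) and then drop all terms involving $P_1\mathbf{1}$, we obtain (after a small computation)

\[
\begin{aligned}
\frac{1}{3}[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)]\mathbf{1}
&= \frac{1}{12}(X_1^2 P_1 P_1^2 X_1 + P_1 X_1^2 P_1^2 X_1)\mathbf{1} \\
&\quad - \frac{1}{12}(X_1 P_1^2 P_1 X_1^2 + P_1^2 X_1 P_1 X_1^2)\mathbf{1} \\
&= -\frac{1}{12} P_1^2 X_1 P_1 X_1^2 \mathbf{1} \\
&= -\frac{4}{12}(-i\hbar)^3 \cdot \mathbf{1}.
\end{aligned}
\]

Since 6/9 does not equal 4/12, we have a contradiction.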
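The two numbers in this contradiction can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes one degree of freedom, units with $\hbar = 1$, and the symmetrized forms $Q_{\mathrm{Weyl}}(x^2 p) = \tfrac12(X^2P + PX^2)$ and $Q_{\mathrm{Weyl}}(x p^2) = \tfrac12(XP^2 + P^2X)$. The helper names (`X`, `P`, `word`, `groenewold_values`) are hypothetical, and operators act on polynomial coefficient arrays.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

HBAR = 1.0  # assumption: units with hbar = 1


def X(c):
    """Position operator: multiply the polynomial with coefficients c by x."""
    return npoly.polymulx(np.asarray(c, dtype=complex))


def P(c):
    """Momentum operator: -i*hbar*d/dx acting on polynomial coefficients."""
    return -1j * HBAR * npoly.polyder(np.asarray(c, dtype=complex))


def word(ops, c):
    """Apply a product of operators to c; ops[0] is leftmost, so it acts last."""
    for op in reversed(ops):
        c = op(c)
    return c


def q_x2p(c):
    """Assumed Weyl quantization of x^2 p: (X^2 P + P X^2) / 2."""
    return 0.5 * npoly.polyadd(word([X, X, P], c), word([P, X, X], c))


def q_xp2(c):
    """Assumed Weyl quantization of x p^2: (X P^2 + P^2 X) / 2."""
    return 0.5 * npoly.polyadd(word([X, P, P], c), word([P, P, X], c))


def groenewold_values():
    """Return both sides of the would-be identity, applied to the constant 1."""
    one = np.array([1.0 + 0j])
    # (1/9) [X^3, P^3] 1
    lhs = npoly.polysub(word([X, X, X, P, P, P], one),
                        word([P, P, P, X, X, X], one)) / 9
    # (1/3) [Q(x^2 p), Q(x p^2)] 1
    rhs = npoly.polysub(q_x2p(q_xp2(one)), q_xp2(q_x2p(one))) / 3
    return lhs[0], rhs[0]


lhs, rhs = groenewold_values()
print(lhs, rhs)  # the two values differ, which is the contradiction
```

With $\hbar = 1$ one has $(-i\hbar)^3 = i$, so the first value is $-\tfrac{6}{9}i$ and the second is $-\tfrac{4}{12}i$.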


13.5  Exercises 

1. Let $V_j$ denote the space of complex-valued homogeneous polynomials on $\mathbb{R}^2$ of degree $j$. Then $V_j$ is a complex vector space of dimension $j + 1$, which we may identify with $\mathbb{C}^{j+1}$ using the obvious basis for $V_j$. Let $W_j$ denote the complex subspace of $V_j$ spanned by polynomials of the form $(ax + bp)^j$, with $a, b \in \mathbb{C}$. Show that $W_j = V_j$.

Hint: Since every subspace of $\mathbb{C}^{j+1}$ is (topologically) closed, if $\gamma(t)$ is a smooth curve in $W_j$, the derivative $\gamma'(t)$ will also lie in $W_j$.

2. Show that the symmetrized pseudodifferential operator quantization of $x^2 p^2$ is equal to $Q_{\mathrm{Weyl}}(x^2 p^2) - \hbar^2/2$.

3. Show that Wick-ordered and anti-Wick-ordered quantizations map real-valued polynomials to symmetric operators on $C_c^\infty(\mathbb{R})$.

Hint: Compare the values of each quantization scheme on $z^k \bar{z}^l$ and on its complex conjugate $\overline{z^k \bar{z}^l}$.


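4. Consider a classical harmonic oscillator with Hamiltonian

\[
H(x, p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 x^2 = \frac{1}{2} m\omega^2 \left( x^2 + \frac{p^2}{(m\omega)^2} \right),
\]

where $\omega$ is the frequency of the oscillator. Consider the Wick- and anti-Wick-ordered quantizations with parameter $\alpha = 1/(m\omega)$. Show that

\[
\begin{aligned}
Q_{\mathrm{Wick}}(H) &= Q_{\mathrm{Weyl}}(H) - \frac{1}{2}\hbar\omega, \\
Q_{\mathrm{anti\text{-}Wick}}(H) &= Q_{\mathrm{Weyl}}(H) + \frac{1}{2}\hbar\omega.
\end{aligned}
\]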


5. Let $U_{\mathbf{a},\mathbf{b}}(t)$ be as in Proposition 13.5. Show by direct calculation that these operators form a one-parameter unitary group.

6. Given $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$, let $A_\kappa$ denote the associated integral operator on $L^2(\mathbb{R}^n)$, as in Proposition 13.6. Show that the adjoint $A_\kappa^*$ of $A_\kappa$ is also an integral operator, with integral kernel $\kappa'$ given by

\[
\kappa'(\mathbf{x}, \mathbf{y}) = \overline{\kappa(\mathbf{y}, \mathbf{x})}.
\]

7. Suppose that $f \in L^2(\mathbb{R}^{2n})$ and that $\hat{f} \in L^1(\mathbb{R}^{2n})$. Then the right-hand side of (13.17) may be understood as an absolutely convergent "Bochner" integral with values in the Banach space $\mathcal{B}(L^2(\mathbb{R}^n))$. Show that $Q_{\mathrm{Weyl}}(f)$ as defined by (13.17) coincides with $Q_{\mathrm{Weyl}}(f)$ as defined in Definition 13.7.

Hint: The Bochner integral commutes with applying a bounded linear functional. Use this result with the linear functional $A \mapsto \langle \phi, A\psi \rangle$ on $\mathcal{B}(L^2(\mathbb{R}^n))$. Then use the expression in (13.23) for $\kappa_f$, which follows from Definition 13.7 by applying a partial Fourier transform.

8. (a) Show that for any polynomial $f$ in one variable, we have

\[
Q_{\mathrm{Weyl}}(f(x)p) = f(X)P - \frac{i\hbar}{2} f'(X).
\]

(b) Show that for any two polynomials $f$ and $g$, the Poisson bracket $\{f(x)p, g(x)p\}$ is of the form $h(x)p$ for some polynomial $h$.

(c) Show that for any two polynomials $f$ and $g$, we have

\[
\frac{1}{i\hbar}[Q_{\mathrm{Weyl}}(f(x)p), Q_{\mathrm{Weyl}}(g(x)p)] = Q_{\mathrm{Weyl}}(\{f(x)p, g(x)p\}).
\]

9. (a) Given $\phi$ and $\psi$ in $L^2(\mathbb{R}^n)$, let $|\phi\rangle\langle\psi|$ be the operator defined in Notation 3.28. Show that $|\phi\rangle\langle\psi|$ can be expressed as an integral operator as in Proposition 13.6 and determine the associated integral kernel $\kappa$.

(b) For $\sigma > 0$, let $\psi_\sigma \in L^2(\mathbb{R}^n)$ be given by the expression

\[
\psi_\sigma(\mathbf{x}) = (\pi\sigma)^{-n/4} e^{-|\mathbf{x}|^2/(2\sigma)}.
\]

Using Proposition A.22, show that $\psi_\sigma$ is a unit vector in $L^2(\mathbb{R}^n)$ and that the Weyl symbol of the corresponding one-dimensional projection operator $|\psi_\sigma\rangle\langle\psi_\sigma|$ is given by

\[
Q_{\mathrm{Weyl}}^{-1}(|\psi_\sigma\rangle\langle\psi_\sigma|) = 2^n e^{-|\mathbf{x}|^2/\sigma} e^{-\sigma|\mathbf{p}|^2/\hbar^2}.
\]

Note: If we give $\sigma$ the value $\hbar/(m\omega)$, the Gaussian function $\psi_\sigma$ may be thought of as the ground state for an n-dimensional harmonic oscillator. (Compare the functions in Theorem 11.3.) The computation in this exercise plays an important role in the proof of the Stone-von Neumann theorem in Chap. 14.

10. If $f$ and $g$ are Schwartz functions on $\mathbb{R}^{2n}$, show that, as $\hbar$ tends to zero, the Fourier transform $\widehat{f \star g}$ converges in the $L^1$ norm to $(2\pi)^{-n} \hat{f} * \hat{g}$, where $*$ denotes convolution. Conclude that $f \star g$ converges uniformly to $fg$ as $\hbar$ tends to zero.

11. Suppose that $f(p, q)$ is a homogeneous polynomial of degree 2. Show that for each $t$, the Hamiltonian flow associated with $f$ is a linear map of $\mathbb{R}^{2n}$ to itself.

12.  Prove  Lemma  13.16. 

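Hint: Let $g_1 \in \mathcal{P}_2$ be given by

\[
g_1(\mathbf{x}, \mathbf{p}) = \sum_{j=1}^{n} x_j p_j.
\]

Show that for any monomial of the form $\mathbf{x}^{\mathbf{j}} \mathbf{p}^{\mathbf{k}}$, we have $\{g_1, \mathbf{x}^{\mathbf{j}} \mathbf{p}^{\mathbf{k}}\} = (|\mathbf{k}| - |\mathbf{j}|)\,\mathbf{x}^{\mathbf{j}} \mathbf{p}^{\mathbf{k}}$. Thus, most of the standard basis elements $f$ for $\mathcal{P}_2$ and all of the standard basis elements $f$ for $\mathcal{P}_3$ can be obtained as nonzero multiples of $\{g_1, f\}$.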


14 

The  Stone-von  Neumann  Theorem 


The Stone-von Neumann theorem is a uniqueness theorem for operators satisfying the canonical commutation relations. Suppose $A$ and $B$ are two self-adjoint operators on $\mathbf{H}$ satisfying $[A, B] = i\hbar I$. Suppose also that $A$ and $B$ act irreducibly on $\mathbf{H}$, meaning that the only closed subspaces of $\mathbf{H}$ invariant under $A$ and $B$ are $\{0\}$ and $\mathbf{H}$. Then provided that certain technical assumptions hold (the exponentiated commutation relations), we will conclude that $A$ and $B$ are unitarily equivalent to the usual position and momentum operators $X$ and $P$. That is, there is a unitary operator $U : \mathbf{H} \to L^2(\mathbb{R})$ such that $UAU^{-1} = X$ and $UBU^{-1} = P$. If $\mathbf{H}$ is not irreducible, then it decomposes as a direct sum of invariant subspaces $V_l$ for $A$ and $B$, and the restrictions of $A$ and $B$ to each $V_l$ are unitarily equivalent to the usual $X$ and $P$.

We begin this chapter with a heuristic argument for the Stone-von Neumann theorem, an argument that glosses over certain (essential but technical) domain issues. Then we introduce the exponentiated commutation relations, which should be thought of as a sort of mild strengthening of the ordinary canonical commutation relations. Finally, we give a precise statement of the theorem and provide a proof.


14.1  A  Heuristic  Argument 

Suppose that $A$ and $B$ are any two (possibly unbounded) self-adjoint operators on a separable Hilbert space $\mathbf{H}$ satisfying $[A, B] = i\hbar I$. What we would like to conclude is that $\mathbf{H}$ looks like a Hilbert space direct sum of closed subspaces $V_l$ that are invariant under $A$ and $B$, and such that each $V_l$ is unitarily equivalent to $L^2(\mathbb{R})$ in a way that turns the operators $A$ and $B$ into the standard position and momentum operators $X$ and $P$. That is to say, we hope to find unitary maps $U_l : V_l \to L^2(\mathbb{R})$ such that

\[
U_l A U_l^{-1} = X, \qquad U_l B U_l^{-1} = P.
\]

This conclusion is, however, not quite correct, for reasons having to do with the domains of the relevant operators. Nevertheless, let us consider a heuristic argument for this conclusion. We start by forming a lowering operator $a$ and a raising operator $a^*$ by analogy to the definitions of $a$ and $a^*$ in Chap. 11:

\[
a = \frac{m\omega A + iB}{\sqrt{2\hbar m\omega}}; \qquad a^* = \frac{m\omega A - iB}{\sqrt{2\hbar m\omega}}.
\]

Then we look at the kernel $W$ of the lowering operator $a$, which will be a closed subspace of $\mathbf{H}$, provided that $a$ is a closed operator. The elements of $W$ may be thought of as "ground states" for the operator $a^* a$. Choose an orthonormal basis $\{\phi_0^l\}$ for $W$ and define vectors

\[
\phi_m^l := (a^*)^m \phi_0^l.
\]

It is not hard to show that for $l \neq l'$, $\phi_m^l$ is orthogonal to $\phi_{m'}^{l'}$, for all $m$ and $m'$. Let $V_l$ denote the closed span of the vectors $\phi_m^l$, $m = 0, 1, 2, \ldots$.

Using the calculation in Sect. 11.2, we can see that the way $a$ and $a^*$ act on each chain (the vectors $\phi_m^l$ with $l$ fixed and $m$ varying) is precisely the same as the way the standard lowering and raising operators of Chap. 11, built from $X$ and $P$, act on the chain of eigenvectors for the harmonic oscillator. To keep the two pairs apart, write $a_X$ and $a_X^*$ for the standard operators. Thus, for each $l$, we can construct a unitary map $U_l$ from $V_l$ to $L^2(\mathbb{R})$ by mapping the vectors $\phi_m^l$ in $V_l$ to the corresponding vectors in $L^2(\mathbb{R})$ described in Theorems 11.3 and 11.4. (In particular, the vector in $L^2(\mathbb{R})$ corresponding to $\phi_0^l$ is the ground state for the harmonic oscillator, which is a Gaussian.) Since the formula for how $a$ and $a^*$ act is the same as the formula for how $a_X$ and $a_X^*$ act, $U_l$ will "intertwine" $a$ with $a_X$ and $a^*$ with $a_X^*$, meaning that $U_l a = a_X U_l$, and similarly for $a^*$ and $a_X^*$. It follows that $U_l$ also intertwines $A$ with $X$ and $B$ with $P$.

It remains only to argue (heuristically) that the spaces $V_l$ fill up the whole Hilbert space $\mathbf{H}$. Clearly, the span $V$ of the $V_l$'s is invariant under both $a$ and $a^*$. Thus, the orthogonal complement $V^\perp$ of $V$ is invariant under the adjoints $a^*$ and $a$. If $V^\perp$ is not zero, then arguing as in Chap. 11, there should be a ground state in $V^\perp$, that is, a nonzero vector annihilated by $a$. This vector would be orthogonal to all the $\phi_0^l$'s, contradicting the assumption that the $\phi_0^l$'s form an orthonormal basis for the kernel of $a$.

The preceding heuristic argument cannot be completely rigorous, however, since the counterexample in Sect. 12.2 gives a pair of operators $A$ and $B$ that satisfy the canonical commutation relations but are clearly not unitarily equivalent to the usual position and momentum operators. After all, the "position" operator $A$ in that section is a bounded operator, which cannot be unitarily equivalent to the usual position operator.

What goes wrong is, as usual, a matter of domain considerations. Setting $m$, $\hbar$, and $\omega$ equal to 1, we can look for a vector $\phi_0$ that is annihilated by the operator

\[
a = \frac{1}{\sqrt{2}}(A + iB) = \frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right).
\]

By the same argument as in Chap. 11, $\phi_0$ must be a constant multiple of the function $e^{-x^2/2}$. The function $\phi_1 := a^*\phi_0$ is then a multiple of $x e^{-x^2/2}$. The problem is that $\phi_1$ is not in the domain of $a^*$. After all, $\phi_1$ does not satisfy the periodic boundary condition $\psi(-1) = \psi(1)$ that defines the domain of $B$. Thus, we cannot continue to apply $a^*$ to obtain an orthogonal chain of vectors and the entire argument breaks down.

What we need, then, is some additional condition that will distinguish between the "good" cases of the canonical commutation relations and the "bad" cases. One possibility for this additional condition is the exponentiated form of the canonical commutation relations, which is discussed in the following section. Our rigorous proof (Sect. 14.3) of the Stone-von Neumann theorem will follow the same outline as the heuristic argument in this section, except that the unbounded operators $a$ and $a^*$ will be replaced by certain bounded operators, constructed by an analog of the Weyl quantization.
If $A$ is a bounded operator on a Hilbert space $\mathbf{H}$, we may define the exponential of $A$, denoted either $e^A$ or $\exp(A)$, by the power series

\[
e^A = \sum_{m=0}^{\infty} \frac{A^m}{m!},
\]

where $A^0 = I$. A standard power series argument shows that if $A, B \in \mathcal{B}(\mathbf{H})$ commute, then

\[
e^{A+B} = e^A e^B, \qquad [A, B] = 0. \tag{14.1}
\]

(See Exercise 6 in Chap. 16.) Even when $A$ and $B$ do not commute, there is a formula, called the Baker-Campbell-Hausdorff formula, that expresses $e^A e^B$, for sufficiently small $A$ and $B$, in the form

\[
e^A e^B = \exp\left\{ A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] + \cdots \right\},
\]

where the terms indicated by $\cdots$ are iterated commutators involving $A$ and $B$. (See Chap. 3 of [21] for more information.) A very special case of this formula is obtained in the case where $A$ and $B$ commute with their commutator, so that all higher commutators are zero.

Theorem 14.1 Suppose $A, B \in \mathcal{B}(\mathbf{H})$ commute with their commutator, that is,

\[
[A, [A, B]] = [B, [A, B]] = 0.
\]

Then

\[
e^A e^B = e^{A + B + \frac{1}{2}[A, B]}.
\]

This relation may also be written as

\[
e^{A+B} = e^{-\frac{1}{2}[A, B]} e^A e^B.
\]

Note that in this special case of the Baker-Campbell-Hausdorff formula, no smallness assumption is imposed on $A$ and $B$.

Proof. We will prove that

\[
e^{tA} e^{tB} = e^{t(A+B) + \frac{t^2}{2}[A, B]}, \tag{14.2}
\]

which reduces to the desired result at $t = 1$. Since $[A, B]$ commutes with everything in sight, we can use (14.1) to split the exponential on the right-hand side of (14.2) into two and then move the factor involving $[A, B]$ to the other side. Thus, (14.2) is equivalent to the relation

\[
e^{tA} e^{tB} e^{-t^2[A, B]/2} = e^{t(A+B)}. \tag{14.3}
\]

Let $\alpha(t)$ denote the left-hand side of (14.3). We will show that $\alpha(t)$ satisfies a simple differential equation, which may be solved explicitly to obtain $\alpha(t) = e^{t(A+B)}$.

Using term-by-term differentiation, it is easy to verify that

\[
\frac{d}{dt} e^{tC} = C e^{tC} = e^{tC} C
\]

for any $C \in \mathcal{B}(\mathbf{H})$, and that

\[
\frac{d}{dt} e^{-t^2[A, B]/2} = e^{-t^2[A, B]/2}(-t[A, B]).
\]

We may then differentiate $\alpha(t)$ using the product rule, which is proved the same way as in the scalar case, giving

\[
\begin{aligned}
\frac{d\alpha}{dt} &= e^{tA} A e^{tB} e^{-t^2[A, B]/2} + e^{tA} e^{tB} B e^{-t^2[A, B]/2} \\
&\quad + e^{tA} e^{tB} e^{-t^2[A, B]/2}(-t[A, B]).
\end{aligned}
\]

To simplify our expression for $d\alpha/dt$, we need an intermediate result. By the product rule,

\[
\frac{d}{dt} e^{-tB} A e^{tB} = e^{-tB}[A, B] e^{tB} = [A, B], \tag{14.4}
\]

because $B$ (and, thus, $e^{tB}$) commutes with $[A, B]$. Noting that $e^{-tB} A e^{tB} = A$ when $t = 0$, we may integrate (14.4) to get

\[
e^{-tB} A e^{tB} = A + t[A, B]. \tag{14.5}
\]

(The difference of the two sides of (14.5) has derivative zero, so by Part (a) of Exercise 2, the two sides are equal up to a constant, which is seen to be zero by evaluating at $t = 0$.)

Using (14.5), we obtain

\[
e^{tA} A e^{tB} = e^{tA} e^{tB}(e^{-tB} A e^{tB}) = e^{tA} e^{tB}(A + t[A, B]).
\]

Moreover, since everything commutes with $[A, B]$, we may commute anything we want past $e^{-t^2[A, B]/2}$. Thus,

\[
\begin{aligned}
\frac{d\alpha}{dt} &= \alpha(t)(A + t[A, B] + B - t[A, B]) \\
&= \alpha(t)(A + B).
\end{aligned}
\]

Now, according to Exercise 2, the unique solution to the differential equation $d\alpha/dt = \alpha(t)(A + B)$ is $\alpha(t) = \alpha(0) e^{t(A+B)}$. Since $\alpha(0) = I$, we obtain the desired result (14.3). ■
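As a sanity check on Theorem 14.1, the identity can be tested numerically on the simplest matrices that commute with their commutator. The sketch below is an illustration rather than part of the proof: it assumes the strictly upper triangular $3 \times 3$ matrices $A = xE_{12}$ and $B = yE_{23}$, whose commutator $xyE_{13}$ is central, and the helper names are hypothetical. Because these matrices are nilpotent, the power series for the exponential terminates and no external matrix-exponential routine is needed.

```python
import numpy as np


def exp_series(M, terms=10):
    """Matrix exponential by its power series; exact here because M is nilpotent."""
    result = np.eye(M.shape[0])
    power = np.eye(M.shape[0])
    for m in range(1, terms):
        power = power @ M / m  # power now holds M^m / m!
        result = result + power
    return result


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A


def bch_gaps(x=1.7, y=-0.4):
    """Return (gap with the commutator term, gap without it)."""
    A = np.zeros((3, 3))
    A[0, 1] = x
    B = np.zeros((3, 3))
    B[1, 2] = y
    C = comm(A, B)
    # Hypothesis of the theorem: A and B commute with [A, B].
    assert np.allclose(comm(A, C), 0) and np.allclose(comm(B, C), 0)
    lhs = exp_series(A) @ exp_series(B)
    with_term = np.max(np.abs(lhs - exp_series(A + B + 0.5 * C)))
    without_term = np.max(np.abs(lhs - exp_series(A + B)))
    return with_term, without_term


print(bch_gaps())
```

The first gap vanishes up to rounding, while the second equals $|xy|/2$, which shows that the correction term $\tfrac{1}{2}[A, B]$ cannot be dropped.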

Suppose, now, that $A$ and $B$ are unbounded self-adjoint operators satisfying

\[
[A, B] = i\hbar I. \tag{14.6}
\]

If we formally apply Theorem 14.1 to $isA$ and $itB$ (even though these operators are unbounded), we obtain

\[
e^{i(sA + tB)} = e^{ist\hbar/2} e^{isA} e^{itB} = e^{-ist\hbar/2} e^{itB} e^{isA},
\]

so that

\[
e^{isA} e^{itB} = e^{-ist\hbar} e^{itB} e^{isA}, \tag{14.7}
\]

where the exponentials $e^{isA}$ and $e^{itB}$ are defined by means of the spectral theorem. It is essential to emphasize that the conclusion (14.7) is only formal, since it assumes that results for bounded operators carry over to unbounded operators, which is false in general. Nevertheless, we may hope that in "good" cases, self-adjoint operators satisfying (14.6) will also satisfy (14.7).

Extending the preceding discussion to the case of several degrees of freedom in an obvious way, we are led to the following definition.




Definition 14.2 If $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ are possibly unbounded self-adjoint operators on $\mathbf{H}$, the $A$'s and $B$'s satisfy the exponentiated commutation relations if the following relations hold for all $1 \le j, k \le n$ and $s, t \in \mathbb{R}$:

\[
\begin{aligned}
e^{isA_j} e^{itA_k} &= e^{itA_k} e^{isA_j}, \\
e^{isB_j} e^{itB_k} &= e^{itB_k} e^{isB_j}, \\
e^{isA_j} e^{itB_k} &= e^{-ist\hbar\delta_{jk}} e^{itB_k} e^{isA_j}.
\end{aligned}
\]

The operators $e^{isA_j}$ and $e^{itB_k}$ are defined by the spectral theorem for unbounded self-adjoint operators, and they are unitary operators, defined on all of $\mathbf{H}$. Thus, when we say that the exponentiated commutation relations hold, we mean that they hold on the entire Hilbert space $\mathbf{H}$.


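Notation 14.3 Suppose operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfy the exponentiated commutation relations. Then for all $\mathbf{a}$ and $\mathbf{b}$ in $\mathbb{R}^n$, let $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ denote the unitary operator given by

\[
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} = e^{i\hbar(\mathbf{a}\cdot\mathbf{b})/2}\, e^{ia_1 A_1} \cdots e^{ia_n A_n}\, e^{ib_1 B_1} \cdots e^{ib_n B_n}. \tag{14.8}
\]

Equation (14.8) is nothing but what we obtain by formally applying Theorem 14.1 to the operators $i\mathbf{a}\cdot\mathbf{A}$ and $i\mathbf{b}\cdot\mathbf{B}$ and then further splitting the exponentials by formally applying (14.1). The notation may be further justified by checking (Exercise 4) that the operators

\[
U_{\mathbf{a},\mathbf{b}}(t) = e^{it^2\hbar(\mathbf{a}\cdot\mathbf{b})/2}\, e^{ita_1 A_1} \cdots e^{ita_n A_n}\, e^{itb_1 B_1} \cdots e^{itb_n B_n} \tag{14.9}
\]

form a strongly continuous one-parameter unitary group. If we then define $\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B}$ as the infinitesimal generator (Sect. 10.2) of $U_{\mathbf{a},\mathbf{b}}$, the relation (14.8) will indeed hold. Using the definition of $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ and the exponentiated commutation relations, a simple calculation shows that

\[
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, e^{i(\mathbf{a}'\cdot\mathbf{A} + \mathbf{b}'\cdot\mathbf{B})} = e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\, e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{A} + (\mathbf{b}+\mathbf{b}')\cdot\mathbf{B})}. \tag{14.10}
\]

In particular, $e^{-i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ is the inverse of $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$, as the notation suggests.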

The following examples show that in the good case (the usual position and momentum operators on $L^2(\mathbb{R}^n)$), the exponentiated commutation relations do hold, whereas in the bad case (the counterexample in Sect. 12.2), they do not.


Example 14.4 Let $A_j$ be the usual position operator $X_j$ acting on $L^2(\mathbb{R}^n)$ and let $B_j$ be the usual momentum operator $P_j$. Then the $A$'s and $B$'s satisfy the exponentiated commutation relations.

Proof. Since $X_j$ is just multiplication by $x_j$, it is easily verified that $e^{isX_j}$ is just multiplication by $e^{isx_j}$. Meanwhile, the exponentiated momentum operators satisfy (Example 10.16)

\[
(e^{itP_j}\psi)(\mathbf{x}) = \psi(\mathbf{x} + t\hbar\mathbf{e}_j).
\]

It is then evident that $e^{isX_j}$ commutes with $e^{itX_k}$ and that $e^{isP_j}$ commutes with $e^{itP_k}$. We may also compute that

\[
\begin{aligned}
(e^{itP_k} e^{isX_j}\psi)(\mathbf{x}) &= e^{is(x_j + t\hbar\delta_{jk})}\, \psi(\mathbf{x} + t\hbar\mathbf{e}_k) \\
&= e^{ist\hbar\delta_{jk}} (e^{isX_j} e^{itP_k}\psi)(\mathbf{x}),
\end{aligned}
\]

which is what we wanted to prove. ■
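The relations just proved, and the composition law (14.10) that follows from them, can be spot-checked pointwise. The sketch below assumes one degree of freedom, an arbitrary value of $\hbar$, and an arbitrary smooth test function; the names `exp_isX`, `exp_itP`, and `weyl` are hypothetical helpers that implement multiplication by $e^{isx}$, translation by $t\hbar$, and formula (14.8), respectively.

```python
import cmath
import math

HBAR = 0.7  # assumption: any positive value works for this check


def psi(x):
    """Arbitrary smooth test function (an assumption, not from the text)."""
    return math.exp(-x * x / 2) * (1 + 0.3 * x)


def exp_isX(s, f):
    """e^{isX}: multiplication by e^{isx}."""
    return lambda x: cmath.exp(1j * s * x) * f(x)


def exp_itP(t, f):
    """e^{itP}: translation, (e^{itP} f)(x) = f(x + t*hbar)."""
    return lambda x: f(x + t * HBAR)


def weyl(a, b, f):
    """e^{i(aX + bP)} as in (14.8): phase times e^{iaX} e^{ibP}."""
    inner = exp_isX(a, exp_itP(b, f))
    return lambda x: cmath.exp(1j * HBAR * a * b / 2) * inner(x)


def ecr_gap(s, t, x):
    """|e^{isX} e^{itP} psi - e^{-ist*hbar} e^{itP} e^{isX} psi| at the point x."""
    lhs = exp_isX(s, exp_itP(t, psi))(x)
    rhs = cmath.exp(-1j * s * t * HBAR) * exp_itP(t, exp_isX(s, psi))(x)
    return abs(lhs - rhs)


def composition_gap(a, b, a2, b2, x):
    """Pointwise defect in the composition law (14.10)."""
    lhs = weyl(a, b, weyl(a2, b2, psi))(x)
    phase = cmath.exp(-1j * HBAR * (a * b2 - b * a2) / 2)
    rhs = phase * weyl(a + a2, b + b2, psi)(x)
    return abs(lhs - rhs)


print(ecr_gap(1.3, -0.8, 0.25), composition_gap(0.3, -1.1, 0.8, 0.5, 0.25))
```

Both defects vanish up to rounding at every sample point, in agreement with the third exponentiated relation and with (14.10).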


Example 14.5 Let $A$ be the operator in Sect. 12.2 and let $B$ be the (unique self-adjoint extension of) the operator in that section. Then $A$ and $B$ do not satisfy the exponentiated commutation relations.

Proof. The operator $A$ is multiplication by $x$, and so the operator $e^{isA}$ is just multiplication by $e^{isx}$. Meanwhile, the operator $B$ is $-i\hbar\, d/dx$, with periodic boundary conditions. We will now demonstrate that $e^{itB}$ consists of "translation with wraparound." Specifically, for any $a \in \mathbb{R}$ and $\psi \in L^2([-1, 1])$, let us define $S_a\psi \in L^2([-1, 1])$ by

\[
(S_a\psi)(x) = \psi(x + a - 2m_{x,a}),
\]

where $m_{x,a}$ is the unique integer such that

\[
-1 \le x + a - 2m_{x,a} < 1.
\]

It is easy to check that $S_a$ is a unitary map of $L^2([-1, 1])$ for each $a \in \mathbb{R}$. We then claim that

\[
e^{itB} = S_{\hbar t}. \tag{14.11}
\]

To verify the correctness of (14.11), observe that $B$ has an orthonormal basis of eigenvectors, namely the functions $\psi_n(x) := e^{\pi i n x}$, $n \in \mathbb{Z}$, with the corresponding eigenvalues being $\pi n\hbar$. Thus, if we compute $e^{itB}$ by means of the spectral theorem, we have

\[
e^{itB}\psi_n = e^{\pi i n t\hbar}\psi_n.
\]

On the other hand,

\[
\begin{aligned}
(S_a\psi_n)(x) &= e^{\pi i n(x + a - 2m_{x,a})} \\
&= e^{-2\pi i n m_{x,a}}\, e^{\pi i n a}\, e^{\pi i n x} \\
&= e^{\pi i n a}\psi_n(x),
\end{aligned}
\]

showing that $e^{itB}$ and $S_{\hbar t}$ agree on each of the functions $\psi_n$, $n \in \mathbb{Z}$, and thus on all of $L^2([-1, 1])$.

Having computed both $e^{isA}$ and $e^{itB}$, we may now easily see that these operators do not satisfy the exponentiated commutation relations. Applying them to the constant function $\mathbf{1}$, we have, for example, that

\[
(e^{isA} e^{itB}\mathbf{1})(x) = e^{isx},
\]

whereas, with $a = t\hbar$,

\[
(e^{itB} e^{isA}\mathbf{1})(x) = e^{is(x + t\hbar - 2m_{x,a})}.
\]

The function $e^{is(x + t\hbar - 2m_{x,a})}$ is not equal to $e^{ist\hbar} e^{isx}$ but rather to

\[
e^{ist\hbar}\, e^{isx}\, e^{-2ism_{x,a}},
\]

where $e^{-2ism_{x,a}}$ is not always equal to 1.
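The failure can be made concrete by evaluating both sides at points where the translation does and does not wrap around. The following sketch assumes $\hbar = 1$ and uses hypothetical helper names; `wrap` computes $x + a - 2m_{x,a}$ via $m_{x,a} = \lfloor (x + a + 1)/2 \rfloor$, which is one way to pick the unique integer in the definition of $S_a$.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def wrap(y):
    """Map y to y - 2m, with m the unique integer such that -1 <= y - 2m < 1."""
    m = math.floor((y + 1) / 2)
    return y - 2 * m


def exp_isA(s, f):
    """e^{isA}: multiplication by e^{isx} on [-1, 1]."""
    return lambda x: cmath.exp(1j * s * x) * f(x)


def exp_itB(t, f):
    """e^{itB} = S_{hbar t}: translation with wraparound, as in (14.11)."""
    return lambda x: f(wrap(x + HBAR * t))


def one(x):
    """The constant function 1."""
    return 1.0


def ecr_defect(s, t, x):
    """|e^{isA} e^{itB} 1 - e^{-ist*hbar} e^{itB} e^{isA} 1| at the point x."""
    lhs = exp_isA(s, exp_itB(t, one))(x)
    rhs = cmath.exp(-1j * s * t * HBAR) * exp_itB(t, exp_isA(s, one))(x)
    return abs(lhs - rhs)


print(ecr_defect(1.0, 0.5, 0.1))  # no wraparound at x = 0.1
print(ecr_defect(1.0, 0.5, 0.9))  # wraparound at x = 0.9
```

At $x = 0.1$ the shifted point stays inside $[-1, 1)$ and the defect is zero; at $x = 0.9$ the shift wraps around with $m_{x,a} = 1$, and the defect is $|1 - e^{-2is}| = 2|\sin s|$, which is nonzero for $s = 1$.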


14.3  The  Theorem 


We  give  two  versions  of  the  Stone-von  Neumann  theorem,  one  for  general 
operators  satisfying  the  exponentiated  commutation  relations  and  one  for 
the  special  case  where  the  operators  act  irreducibly. 


Definition 14.6 Operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfying the exponentiated commutation relations are said to act irreducibly on $\mathbf{H}$ if the only closed subspaces of $\mathbf{H}$ that are invariant under every $e^{itA_j}$ and every $e^{itB_j}$ are $\{0\}$ and $\mathbf{H}$.

Proposition 14.7 The usual position and momentum operators act irreducibly on $L^2(\mathbb{R}^n)$.

We delay the proof of this result until near the end of this section.

Theorem 14.8 (Stone-von Neumann Theorem) Suppose $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ are self-adjoint operators on $\mathbf{H}$ satisfying the exponentiated commutation relations. Then $\mathbf{H}$ can be decomposed as an orthogonal direct sum of closed subspaces $\{V_l\}$ with the following properties. First, each $V_l$ is invariant under $e^{itA_j}$ and $e^{itB_j}$ for all $j$ and $t$. Second, there exist unitary operators $U_l : V_l \to L^2(\mathbb{R}^n)$ such that

\[
U_l e^{itA_j} U_l^{-1} = e^{itX_j}
\]

and

\[
U_l e^{itB_j} U_l^{-1} = e^{itP_j}
\]

for all $j$ and $t$.

If, in addition, the $A$'s and $B$'s act irreducibly on $\mathbf{H}$, then there exists a single unitary map $U : \mathbf{H} \to L^2(\mathbb{R}^n)$ such that

\[
U e^{itA_j} U^{-1} = e^{itX_j}
\]

and

\[
U e^{itB_j} U^{-1} = e^{itP_j}
\]

for all $j$ and $t$. The map $U$ is unique up to multiplication by a constant of absolute value 1.

The preceding results can be expressed in terms of the Heisenberg group; see Exercise 6.

Our strategy (as in von Neumann's 1931 paper [41]) in proving Theorem 14.8 is to follow the outline of the heuristic argument in Sect. 14.1, but replacing the unbounded raising and lowering operators by the bounded operators $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ in Notation 14.3. If we define $\phi_0 \in L^2(\mathbb{R}^n)$ by

\[
\phi_0(\mathbf{x}) = (\pi\sigma)^{-n/4} e^{-|\mathbf{x}|^2/(2\sigma)}, \tag{14.12}
\]

for some $\sigma > 0$, then $\phi_0$ is a unit vector, which we may think of as the ground state of an n-dimensional harmonic oscillator with frequency $\omega = \hbar/(m\sigma)$. We can easily compute the Weyl symbol of the projection $|\phi_0\rangle\langle\phi_0|$ onto $\phi_0$ as follows:

\[
f_0(\mathbf{x}, \mathbf{p}) := Q_{\mathrm{Weyl}}^{-1}(|\phi_0\rangle\langle\phi_0|) = 2^n e^{-|\mathbf{x}|^2/\sigma} e^{-\sigma|\mathbf{p}|^2/\hbar^2}. \tag{14.13}
\]

(See Exercise 9 in Chap. 13.)

We may define a generalized Weyl quantization $Q$ for $\mathbf{H}$ by using the operators $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ in place of the operators $e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}$ in (13.17). We will show that the operator $P := Q(f_0)$ is an orthogonal projection, and we will take $W := \mathrm{Range}(P)$ as our space of ground states in $\mathbf{H}$. A crucial result will be that the projection $P$ is nonzero and, indeed, that the restriction of $P$ to any nonzero subspace invariant under the $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$'s is nonzero.

If $\{\phi^l\}$ is an orthonormal basis for $W$, consider the vectors

\[
\phi^l_{\mathbf{a},\mathbf{b}} := e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\phi^l.
\]

We will show that these vectors are orthogonal for different values of $l$, and that for fixed $l$, the inner product of two such vectors is the same as in the $L^2(\mathbb{R}^n)$ case. Thus, if $V_l$ denotes the closed span of the $\phi^l_{\mathbf{a},\mathbf{b}}$'s with $l$ fixed and $\mathbf{a}$ and $\mathbf{b}$ varying, we can construct a unitary map from $V_l$ to $L^2(\mathbb{R}^n)$ that intertwines the operators $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ with the operators $e^{i(\mathbf{a}\cdot\mathbf{X} + \mathbf{b}\cdot\mathbf{P})}$. The sum of the $V_l$'s must be all of $\mathbf{H}$, for if not, the orthogonal complement $Y$ of the span would be invariant under the $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$'s. Thus, the restriction of $P$ to $Y$ would be nonzero, implying that there are elements of $W := \mathrm{Range}(P)$ orthogonal to every $\phi^l$, contradicting the assumption that the $\phi^l$'s span $W$.

The rest of this section will flesh out the argument sketched in the preceding paragraphs.
Definition 14.9 Suppose self-adjoint operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfy the exponentiated commutation relations on $\mathbf{H}$. For any $f \in \mathcal{S}(\mathbb{R}^{2n})$, define $Q(f) \in \mathcal{B}(\mathbf{H})$ by the formula

\[
Q(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} \hat{f}(\mathbf{a}, \mathbf{b})\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, d\mathbf{a}\, d\mathbf{b},
\]

where $\hat{f}$ is the Fourier transform of $f$ and where $e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}$ is as in Notation 14.3. The integral is a Bochner integral with values in the Banach space $\mathcal{B}(\mathbf{H})$.

We will assume the following standard properties of the Bochner integral (Sect. V.5 of [46]). First, any continuous function $f : \mathbb{R}^{2n} \to \mathcal{B}(\mathbf{H})$ for which $\int \|f(\mathbf{x})\|\, d\mathbf{x} < \infty$ has a well-defined Bochner integral. Second, the Bochner integral commutes with applying bounded linear transformations. Third, a version of Fubini's theorem holds.

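Proposition 14.10 For any operators satisfying the exponentiated commutation relations, the associated map $Q$ in Definition 14.9 has the following properties.

1. If $f \in \mathcal{S}(\mathbb{R}^{2n})$ is real valued, $Q(f)$ is self-adjoint.

2. For all $\mathbf{a}$ and $\mathbf{b}$ in $\mathbb{R}^n$ and $f \in \mathcal{S}(\mathbb{R}^{2n})$, we have

\[
\begin{aligned}
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} Q(f) &= Q(f'), \\
Q(f)\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} &= Q(f''),
\end{aligned}
\]

where $f'$ and $f''$ are the functions with Fourier transforms given by

\[
\begin{aligned}
\hat{f'}(\mathbf{a}', \mathbf{b}') &= e^{i\hbar(\mathbf{a}'\cdot\mathbf{b} - \mathbf{a}\cdot\mathbf{b}')/2}\, \hat{f}(\mathbf{a}' - \mathbf{a}, \mathbf{b}' - \mathbf{b}), \\
\hat{f''}(\mathbf{a}', \mathbf{b}') &= e^{-i\hbar(\mathbf{a}'\cdot\mathbf{b} - \mathbf{a}\cdot\mathbf{b}')/2}\, \hat{f}(\mathbf{a}' - \mathbf{a}, \mathbf{b}' - \mathbf{b}).
\end{aligned}
\]

3. For all $f$ and $g$ in $\mathcal{S}(\mathbb{R}^{2n})$, we have

\[
Q(f)Q(g) = Q(f \star g),
\]

where $\star$ is the Moyal product described in Proposition 13.9.

4. For all $f \in \mathcal{S}(\mathbb{R}^{2n})$, if $Q(f) = 0$ then $f = 0$.

Using both parts of Point 2 of the proposition, we can see that for all $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, we have

\[
e^{-i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, Q(f)\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} = Q(g),
\]

where

\[
\hat{g}(\mathbf{a}', \mathbf{b}') = e^{i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{a}'\cdot\mathbf{b})}\, \hat{f}(\mathbf{a}', \mathbf{b}'). \tag{14.14}
\]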
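Proof. For Point 1, we can re-express $Q(f)$ as

\[
Q(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} \frac{1}{2}\left[ \hat{f}(\mathbf{a}, \mathbf{b})\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} + \hat{f}(-\mathbf{a}, -\mathbf{b})\, e^{-i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} \right] d\mathbf{a}\, d\mathbf{b},
\]

since the change of variable $\mathbf{a}' = -\mathbf{a}$, $\mathbf{b}' = -\mathbf{b}$ brings the second term equal to the first term. If $f$ is real valued, then $\hat{f}(-\mathbf{a}, -\mathbf{b})$ is the conjugate of $\hat{f}(\mathbf{a}, \mathbf{b})$, so that the expression in square brackets in the integral is self-adjoint for each $(\mathbf{a}, \mathbf{b})$.

For the first part of Point 2, we use (14.10) to obtain

\[
e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})} Q(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} e^{-i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{b}\cdot\mathbf{a}')/2}\, \hat{f}(\mathbf{a}', \mathbf{b}')\, e^{i((\mathbf{a}+\mathbf{a}')\cdot\mathbf{A} + (\mathbf{b}+\mathbf{b}')\cdot\mathbf{B})}\, d\mathbf{a}'\, d\mathbf{b}'.
\]

Making the change of variables $\mathbf{a}'' = \mathbf{a}' + \mathbf{a}$ and $\mathbf{b}'' = \mathbf{b}' + \mathbf{b}$ and simplifying gives the desired result. The proof of the second part of Point 2 is similar.

The proof of Point 3 is precisely the same as the proof of Proposition 13.9, which relies only on the exponentiated commutation relations.

For Point 4, suppose that $Q(f) = 0$ for some $f \in \mathcal{S}(\mathbb{R}^{2n})$. Then for all $\phi, \psi \in \mathbf{H}$ and all $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, we have

\[
\begin{aligned}
0 &= \left\langle e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\phi,\; Q(f)\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\psi \right\rangle \\
&= \left\langle \phi,\; e^{-i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, Q(f)\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\psi \right\rangle \\
&= \langle \phi, Q(g)\psi \rangle,
\end{aligned}
\]

where $g$ is as in (14.14). Thus,

\[
0 = \int_{\mathbb{R}^{2n}} e^{i\hbar(\mathbf{a}\cdot\mathbf{b}' - \mathbf{a}'\cdot\mathbf{b})}\, \hat{f}(\mathbf{a}', \mathbf{b}') \left\langle \phi, e^{i(\mathbf{a}'\cdot\mathbf{A} + \mathbf{b}'\cdot\mathbf{B})}\psi \right\rangle d\mathbf{a}'\, d\mathbf{b}' \tag{14.15}
\]

for all $\phi$, $\psi$ and $\mathbf{a}$, $\mathbf{b}$. But (14.15) is just computing the inverse Fourier transform of the function $\hat{f}(\mathbf{a}', \mathbf{b}')\langle \phi, e^{i(\mathbf{a}'\cdot\mathbf{A} + \mathbf{b}'\cdot\mathbf{B})}\psi \rangle$, evaluated (up to a normalization constant) at the point $(-\hbar\mathbf{b}, \hbar\mathbf{a})$. Since $\mathbf{a}$ and $\mathbf{b}$ are arbitrary, the Fourier inversion formula shows that this function must be zero for almost every pair $(\mathbf{a}', \mathbf{b}')$. Now, the function $\langle \phi, e^{i(\mathbf{a}'\cdot\mathbf{A} + \mathbf{b}'\cdot\mathbf{B})}\psi \rangle$ is a continuous function of $(\mathbf{a}', \mathbf{b}')$, and by taking $\phi = e^{i(\mathbf{a}_0\cdot\mathbf{A} + \mathbf{b}_0\cdot\mathbf{B})}\psi$, it can be made to be nonzero at any given point $(\mathbf{a}_0, \mathbf{b}_0)$ in $\mathbb{R}^{2n}$, and thus also in a neighborhood of that point. Thus, actually, $\hat{f}$ is identically zero and so also is $f$. ■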

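Lemma 14.11 Let $f_0$ be the function on $\mathbb{R}^{2n}$ given by

\[
f_0(\mathbf{x}, \mathbf{p}) = 2^n e^{-|\mathbf{x}|^2/\sigma} e^{-\sigma|\mathbf{p}|^2/\hbar^2},
\]

where $\sigma$ is a fixed positive number. Then for all $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, we have

\[
Q(f_0)\, e^{i(\mathbf{a}\cdot\mathbf{A} + \mathbf{b}\cdot\mathbf{B})}\, Q(f_0) = e^{-\sigma|\mathbf{a}|^2/4}\, e^{-\hbar^2|\mathbf{b}|^2/(4\sigma)}\, Q(f_0). \tag{14.16}
\]

In particular,

\[
Q(f_0)^2 = Q(f_0).
\]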

Proof.  By  Proposition  14.10,  (14.16)  is  equivalent  to  the  assertion  that 

/o*/o  =  e-CT|a|2/4e-ft2lbl2/(4CTy0.  (14.17) 

Now,  it  is  certainly  possible  to  establish  (14.17)  by  direct  computation  from 
the  definitions  of  /q  and  ★;  ah  the  integrals  involved  will  be  Gaussian  inte¬ 
grals,  which  can  be  evaluated  by  means  of  Proposition  A. 22.  This  approach, 
however,  is  both  painful  and  unilluminating.  A  more  sensible  approach  is 
to  observe  that  is  suffices  to  verify  (14.16)  for  the  ordinary  Weyl  quantiza¬ 
tion  on  L2(Mn).  After  all,  (14.16)  is  equivalent  to  (14.17),  which  in  turn  is 
equivalent  to  the  identity 

Qweyl(/o)e^a’X+b‘P^Qweyl(/o) 

=  e-<7lal2/4e_ft2|b|2/(4<T)Qweyi(/o),  (14.18) 


by  applying  Proposition  14.10  in  the  case  Q  =  Q weyl- 

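Now, by Exercise 9 in Chap. 13, $Q_{\mathrm{Weyl}}(f_0)$ is the one-dimensional projection $|\phi_0\rangle\langle\phi_0|$, where $\phi_0(x) = (\pi\sigma)^{-n/4} e^{-|x|^2/(2\sigma)}$. Thus,

\[ Q_{\mathrm{Weyl}}(f_0)\, e^{i(a\cdot X+b\cdot P)}\, Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|\, e^{i(a\cdot X+b\cdot P)}\, |\phi_0\rangle\langle\phi_0| = c\, |\phi_0\rangle\langle\phi_0|, \tag{14.19} \]

where

\[ c = \langle \phi_0 | e^{i(a\cdot X+b\cdot P)} | \phi_0 \rangle. \]

To compute $c$, we use (13.20), which gives

\[ c = (\pi\sigma)^{-n/2}\, e^{i\hbar(a\cdot b)/2} \int_{\mathbb{R}^n} e^{-|x|^2/(2\sigma)}\, e^{ia\cdot x}\, e^{-|x+\hbar b|^2/(2\sigma)}\, dx. \tag{14.20} \]

The integral in (14.20) can be computed by expanding $|x + \hbar b|^2$, collecting terms in the exponent, and applying Proposition A.22. The result, after a bit of algebra, is

\[ c = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)}, \]

which gives (14.18). ■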
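The "bit of algebra" behind the value of $c$ can be sanity-checked numerically. The following is a minimal sketch for $n = 1$, assuming the shift formula $(e^{i(aX+bP)}\psi)(x) = e^{i\hbar ab/2} e^{iax}\psi(x+\hbar b)$ that underlies (14.20); the function names and the sample parameter values are illustrative choices, not part of the text.

```python
import cmath
import math


def ground_state(x, sigma):
    """phi_0(x) = (pi*sigma)^(-1/4) * exp(-x^2 / (2*sigma)), the n = 1 case."""
    return (math.pi * sigma) ** -0.25 * math.exp(-x * x / (2.0 * sigma))


def c_numeric(a, b, sigma, hbar, half_width=40.0, steps=8000):
    """Trapezoid-rule value of <phi_0, exp(i(aX + bP)) phi_0>, as in (14.20)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        shifted = cmath.exp(1j * a * x) * ground_state(x + hbar * b, sigma)
        total += weight * ground_state(x, sigma) * shifted
    return cmath.exp(1j * hbar * a * b / 2.0) * total * h


def c_closed_form(a, b, sigma, hbar):
    """The claimed value exp(-sigma a^2/4) * exp(-hbar^2 b^2/(4 sigma))."""
    return math.exp(-sigma * a * a / 4.0 - hbar ** 2 * b * b / (4.0 * sigma))


# Hypothetical sample parameters; any moderate real values should behave alike.
a, b, sigma, hbar = 0.7, -1.3, 2.0, 0.5
print(abs(c_numeric(a, b, sigma, hbar) - c_closed_form(a, b, sigma, hbar)))
```

The trapezoid rule is used because it is extremely accurate for Gaussian integrands on the whole line; the printed difference should be at the level of rounding error.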

We now prove the claimed irreducibility of the usual position and momentum operators.

Proof of Proposition 14.7. Given operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfying the exponentiated commutation relations, consider the operator $Q(f_0)$, where $f_0$ is as in (14.13). According to Lemma 14.11, $Q(f_0)^2 = Q(f_0)$. Since also $f_0$ is real valued, $Q(f_0)$ is self-adjoint and thus an orthogonal projection. Suppose that the range of the orthogonal projection $Q(f_0)$ is one-dimensional. We then claim that the $A_j$'s and $B_j$'s act irreducibly. If not, there would exist a nontrivial closed subspace $V$ that is invariant under each of the operators $e^{i(a\cdot A+b\cdot B)}$. Then the nonzero subspace $V^\perp$ would also be invariant under each of the operators $(e^{i(a\cdot A+b\cdot B)})^* = e^{-i(a\cdot A+b\cdot B)}$. Thus, the exponentiated commutation relations are satisfied in both $V$ and $V^\perp$, with the $A_j$'s and $B_j$'s being the infinitesimal generators of the restrictions of $e^{itA_j}$ and $e^{itB_j}$ to each subspace.

It follows that the restriction of $Q(f_0)$ to each of these subspaces may be thought of as the generalized Weyl quantizations for $V$ and $V^\perp$ of the function $f_0$. Applying Point 4 of Proposition 14.10 to $V$ and to $V^\perp$, we conclude that the restrictions of $Q(f_0)$ to $V$ and to $V^\perp$ are nonzero. Thus, both $V$ and $V^\perp$ will contain nonzero elements of $\mathrm{Range}(Q(f_0))$, contradicting our assumption that $\mathrm{Range}(Q(f_0))$ is one dimensional.

In the case of $L^2(\mathbb{R}^n)$, we have $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$, where $\phi_0$ is given by (14.12), which clearly has a one-dimensional range. Thus, the usual position and momentum operators act irreducibly on $L^2(\mathbb{R}^n)$. ■

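We are finally ready for the proof of the Stone-von Neumann theorem.

Proof of Theorem 14.8. Let $W = \mathrm{Range}(Q(f_0))$, where $f_0$ is given by (14.13) for some fixed $\sigma > 0$. For $\phi, \psi \in W$, we can use (14.10), Lemma 14.11, and the fact that $Q(f_0)$ is the identity on $W$ to obtain

\[ \begin{aligned} \langle e^{i(a\cdot A+b\cdot B)}\phi,\, e^{i(a'\cdot A+b'\cdot B)}\psi \rangle &= \langle Q(f_0)\phi,\, e^{-i(a\cdot A+b\cdot B)}\, e^{i(a'\cdot A+b'\cdot B)}\, Q(f_0)\psi \rangle \\ &= e^{i\hbar(a\cdot b' - b\cdot a')/2}\, \langle \phi,\, Q(f_0)\, e^{i((a'-a)\cdot A+(b'-b)\cdot B)}\, Q(f_0)\psi \rangle \\ &= e^{i\hbar(a\cdot b' - b\cdot a')/2}\, e^{-\sigma|a'-a|^2/4}\, e^{-\hbar^2|b'-b|^2/(4\sigma)}\, \langle \phi, \psi \rangle. \end{aligned} \tag{14.21} \]

Now let $\{\psi^l\}$ be an orthonormal basis for $W$ and define vectors $\psi^l_{a,b}$, $a, b \in \mathbb{R}^n$, by

\[ \psi^l_{a,b} = e^{i(a\cdot A+b\cdot B)}\psi^l. \]

By (14.21), $\psi^l_{a,b}$ is orthogonal to $\psi^{l'}_{a',b'}$ whenever $l \neq l'$. Furthermore,

\[ \langle \psi^l_{a,b},\, \psi^l_{a',b'} \rangle = e^{i\hbar(a\cdot b' - b\cdot a')/2}\, e^{-\sigma|a'-a|^2/4}\, e^{-\hbar^2|b'-b|^2/(4\sigma)}, \tag{14.22} \]

where the right-hand side of (14.22) is "universal," that is, independent of $l$ and independent of the particular Hilbert space in which we are working.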
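The universality of (14.22) can be illustrated in the concrete Hilbert space $L^2(\mathbb{R})$, where $W$ is spanned by $\phi_0$. The sketch below, for $n = 1$, assumes the same shift formula $(e^{i(aX+bP)}\psi)(x) = e^{i\hbar ab/2}e^{iax}\psi(x+\hbar b)$ used for (14.20); all function names and parameter values are illustrative.

```python
import cmath
import math


def phi0(x, sigma):
    """Normalized Gaussian phi_0 on the real line (n = 1)."""
    return (math.pi * sigma) ** -0.25 * math.exp(-x * x / (2.0 * sigma))


def translate(a, b, sigma, hbar):
    """Return the function x -> (exp(i(aX + bP)) phi_0)(x)."""
    phase = cmath.exp(1j * hbar * a * b / 2.0)
    return lambda x: phase * cmath.exp(1j * a * x) * phi0(x + hbar * b, sigma)


def inner(f, g, half_width=40.0, steps=8000):
    """Trapezoid approximation of the integral of conj(f) * g over the real line."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x).conjugate() * g(x)
    return total * h


def universal(a, b, a2, b2, sigma, hbar):
    """Right-hand side of (14.22), with (a2, b2) playing the role of (a', b')."""
    phase = cmath.exp(1j * hbar * (a * b2 - b * a2) / 2.0)
    decay = math.exp(-sigma * (a2 - a) ** 2 / 4.0
                     - hbar ** 2 * (b2 - b) ** 2 / (4.0 * sigma))
    return phase * decay


# Hypothetical sample values.
a, b, a2, b2, sigma, hbar = 0.7, -1.3, -0.4, 0.9, 2.0, 0.5
lhs = inner(translate(a, b, sigma, hbar), translate(a2, b2, sigma, hbar))
print(abs(lhs - universal(a, b, a2, b2, sigma, hbar)))
```

Agreement here checks both the Gaussian decay and the sign of the phase factor in (14.22) for this one representation.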

Let $V_l$ be the closed span of the vectors $\psi^l_{a,b}$ with $l$ fixed and $a$, $b$ varying. We may define a map $U_l : V_l \to L^2(\mathbb{R}^n)$ by requiring that

\[ U_l\left( \sum_{j=1}^N c_j\, \psi^l_{a_j,b_j} \right) = \sum_{j=1}^N c_j\, e^{i(a_j\cdot X+b_j\cdot P)}\phi_0 \]

for every sequence $a_1, \ldots, a_N$ and $b_1, \ldots, b_N$ of vectors and all complex coefficients $c_1, \ldots, c_N$, where $\phi_0$ is as in (14.12).

This map is isometric by (14.22) on linear combinations of the $\psi^l_{a,b}$'s and thus extends uniquely to an isometric map of $V_l$ into $L^2(\mathbb{R}^n)$. [In particular, $U_l$ is well defined: If some linear combination of $\psi^l_{a,b}$'s is zero, then this linear combination has norm zero and so its image under $U_l$ also has norm zero and is thus zero in $L^2(\mathbb{R}^n)$.]

Now, $V_l$ is invariant under the operators $e^{i(a\cdot A+b\cdot B)}$ by (14.10), and, similarly, the image of $V_l$ under $U_l$ is invariant under the operators $e^{i(a\cdot X+b\cdot P)}$. By the irreducibility of $L^2(\mathbb{R}^n)$ (Proposition 14.7), we conclude that $U_l$ maps $V_l$ onto $L^2(\mathbb{R}^n)$ and is, therefore, unitary. Furthermore, using (14.10) and the analogous expression (13.31) for the position and momentum operators, it is easy to check that each $U_l$ intertwines $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$, for all $a, b \in \mathbb{R}^n$. In particular, taking either $a = te_j$ and $b = 0$ or $a = 0$ and $b = te_j$, we see that $U_l$ intertwines $e^{itA_j}$ with $e^{itX_j}$. Similarly, $U_l$ intertwines $e^{itB_j}$ with $e^{itP_j}$.

We now argue that the Hilbert space direct sum of the orthogonal subspaces $V_l$ is all of $\mathbf{H}$. If not, then as in the proof of Proposition 14.7, the orthogonal complement $Y$ of this sum would be invariant under the operators $e^{i(a\cdot A+b\cdot B)}$ and thus also under the operator $Q(f_0)$. Furthermore, as in the proof of Proposition 14.7, the restriction of $Q(f_0)$ to $Y$ would be nonzero. Thus, there would exist elements of $W = \mathrm{Range}(Q(f_0))$ orthogonal to each $\psi^l$, contradicting the assumption that the $\psi^l$'s span $W$.

It remains only to address the irreducible case. If the $A_j$'s and $B_j$'s act irreducibly, then there can be only one subspace, $V_1 = \mathbf{H}$, which means that $W$ must be one dimensional. Any unitary map $U : \mathbf{H} \to L^2(\mathbb{R}^n)$ that intertwines each operator $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$ must also intertwine each operator of the form $Q(f)$ with $Q_{\mathrm{Weyl}}(f)$. It follows that $U$ must map the one-dimensional subspace $W$ unitarily onto the one-dimensional range of $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$. Thus, the restriction of $U$ to $W$ is unique up to a constant of absolute value 1. But the reasoning leading to the existence of $U$ shows that $U$ is determined by its action on $W$, so the entire map $U$ is unique up to a constant. ■


14.4  The  Segal-Bargmann  Space 

A simple example of the Stone-von Neumann theorem is provided by the Hilbert space $\mathbf{H} := L^2(\mathbb{R}^n)$, together with the operators $A_j := P_j$ and $B_j := -X_j$. In that case (Exercise 3), the unitary map $U$ in the Stone-von Neumann theorem will simply be a scaled version of the Fourier transform, as in Definition 6.1. To obtain a more interesting example, we construct a Hilbert space consisting of holomorphic functions on $\mathbb{C}^n$.

14.4.1 The Raising and Lowering Operators

A smooth function $F : \mathbb{C}^n \to \mathbb{C}$ is said to be holomorphic if it is holomorphic as a function of $z_j$ with the other $z_k$'s fixed. Equivalently, $F$ is holomorphic if $\partial F/\partial \bar z_j = 0$ for each $j$, where

\[ \frac{\partial}{\partial \bar z_j} = \frac{1}{2}\left( \frac{\partial}{\partial x_j} + i\,\frac{\partial}{\partial y_j} \right). \]

The operator

\[ \frac{\partial}{\partial z_j} = \frac{1}{2}\left( \frac{\partial}{\partial x_j} - i\,\frac{\partial}{\partial y_j} \right) \]

preserves the space of holomorphic functions on $\mathbb{C}^n$.

Consider the operators $z_j$ (i.e., multiplication by $z_j$) and $\hbar\,\partial/\partial z_j$, acting on the space of holomorphic functions on $\mathbb{C}^n$. Fock [9] observed that these operators satisfy the following commutation relations:

\[ [z_j, z_k] = 0, \qquad \left[ \hbar\frac{\partial}{\partial z_j},\, \hbar\frac{\partial}{\partial z_k} \right] = 0, \qquad \left[ \hbar\frac{\partial}{\partial z_j},\, z_k \right] = \hbar\,\delta_{jk}. \tag{14.23} \]

These are essentially the same commutation relations as the raising and lowering operators considered in Sect. 11.2. Specifically, (14.23) are the relations that would be satisfied by the natural higher-dimensional analogs of the operators $a$ and $a^*$ in that section if we omitted the factor of $\sqrt{\hbar}$ in the denominator in (11.4) and (11.5).

Now, if we wish to interpret the operators $z_j$ and $\hbar\,\partial/\partial z_j$ as raising and lowering operators, then we should look for an inner product on the space of holomorphic functions that would make these two operators adjoints of each other. After all, the analysis in Chap. 11 strongly depends on the assumption that $a$ and $a^*$ are adjoints of each other. In the early 1960s, Segal [36] and Bargmann [2] identified such an inner product. Once we have described this Segal-Bargmann inner product, we will construct self-adjoint "position" and "momentum" operators as appropriate linear combinations of $z_j$ and $\hbar\,\partial/\partial z_j$. We will then verify the exponentiated commutation relations and irreducibility, allowing us to apply the Stone-von Neumann theorem.

We look for an $L^2$ inner product with respect to a measure having a positive density with respect to the Lebesgue measure on $\mathbb{C}^n$.


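Lemma 14.12 Suppose that $\mu$ is a smooth, strictly positive density on $\mathbb{C}^n$ and that $F$ and $G$ are sufficiently nice (but not necessarily holomorphic) functions on $\mathbb{C}^n$. Then

\[ \int_{\mathbb{C}^n} \overline{F(z)}\, \frac{\partial G}{\partial z_j}\, \mu(z)\, dz = -\int_{\mathbb{C}^n} \overline{\frac{\partial F}{\partial \bar z_j}}\, G(z)\, \mu(z)\, dz - \int_{\mathbb{C}^n} \frac{\partial \log\mu}{\partial z_j}\, \overline{F(z)}\, G(z)\, \mu(z)\, dz, \tag{14.24} \]

where $dz$ denotes the $2n$-dimensional Lebesgue measure on $\mathbb{C}^n = \mathbb{R}^{2n}$.

Equation (14.24) tells us that

\[ \left( \frac{\partial}{\partial z_j} \right)^* = -\frac{\partial}{\partial \bar z_j} - \frac{\partial \log\mu}{\partial \bar z_j}, \]

where the adjoint is computed with respect to the inner product for the Hilbert space $L^2(\mathbb{C}^n, \mu)$. If we restrict the adjoint operator $(\partial/\partial z_j)^*$ to the space of holomorphic functions, then the $\partial/\partial \bar z_j$ term is zero, by the definition of a holomorphic function.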

Proof. Let us approximate the integral over $\mathbb{C}^n$ on the left-hand side of (14.24) by an integral over a large cube. By performing either the $x_j$-integral or the $y_j$-integral first, we can integrate by parts to push the derivatives with respect to $x_j$ or $y_j$ off of $G$ and onto the product of $\bar F$ and $\mu$ (with a minus sign). The boundary term in the integration by parts will involve the function $\overline{F(z)}\,G(z)\,\mu(z)$ integrated over two opposite faces of the cube. If this function tends to zero sufficiently rapidly at infinity, the boundary terms will vanish in the limit. In that case, we obtain

\[ \int_{\mathbb{C}^n} \overline{F(z)}\, \frac{\partial G}{\partial z_j}\, \mu(z)\, dz = -\int_{\mathbb{C}^n} \left( \frac{\partial \bar F}{\partial z_j} \right) G(z)\, \mu(z)\, dz - \int_{\mathbb{C}^n} \overline{F(z)}\, G(z)\, \frac{\partial \mu}{\partial z_j}\, dz, \]

provided that all three of the above integrals are absolutely convergent. Since $\partial \bar F/\partial z_j = \overline{\partial F/\partial \bar z_j}$ and

\[ \frac{\partial \mu}{\partial z_j} = \frac{\partial \log\mu}{\partial z_j}\, \mu, \]

we obtain (14.24). ■

We now look for a density $\mu_\hbar$ for which $\partial \log\mu_\hbar/\partial \bar z_j = -z_j/\hbar$. In that case, the adjoint operator $(\partial/\partial z_j)^*$ preserves the holomorphic subspace of $L^2(\mathbb{C}^n, \mu_\hbar)$ and is given on this subspace by multiplication by $z_j/\hbar$.

Lemma 14.13 Specialize Lemma 14.12 to the case in which $F$ and $G$ are holomorphic polynomials and $\mu$ is the density given by

\[ \mu_\hbar(z) = (\pi\hbar)^{-n}\, e^{-|z|^2/\hbar}. \tag{14.25} \]

Then we have

\[ \int_{\mathbb{C}^n} \overline{F(z)}\, \frac{\partial G}{\partial z_j}\, \mu_\hbar(z)\, dz = \frac{1}{\hbar} \int_{\mathbb{C}^n} \overline{z_j F(z)}\, G(z)\, \mu_\hbar(z)\, dz. \tag{14.26} \]

Proof. In the case that $F$ and $G$ are holomorphic polynomials, $\partial F/\partial \bar z_j = 0$, so the first term on the right-hand side of (14.24) is zero. Furthermore, $\bar F G \mu_\hbar$ decreases rapidly at infinity and so the boundary terms vanish in this case. Finally, we may compute $\partial \log\mu_\hbar/\partial z_j$ as $-\bar z_j/\hbar$, giving (14.26). ■
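Identity (14.26) is easy to test numerically for $n = 1$. The sketch below integrates against the Gaussian density $(\pi\hbar)^{-1}e^{-|z|^2/\hbar}$ in polar coordinates; the quadrature routine, the two sample polynomials, and the value of $\hbar$ are illustrative assumptions and not part of the text.

```python
import cmath
import math


def integrate_against_mu(G, hbar, n_r=2000, n_theta=64):
    """Approximate the integral of G(w) * mu_hbar(w) over C (the n = 1 case).

    Polar coordinates: Simpson's rule in r on [0, 12*sqrt(hbar)] and an
    equally weighted average over n_theta angles.
    """
    r_max = 12.0 * math.sqrt(hbar)
    h = r_max / n_r  # n_r must be even for Simpson's rule
    units = [cmath.exp(2j * math.pi * k / n_theta) for k in range(n_theta)]
    total = 0j
    for i in range(n_r + 1):
        r = i * h
        simpson = 1.0 if i in (0, n_r) else (4.0 if i % 2 else 2.0)
        ring_average = sum(G(r * u) for u in units) / n_theta
        # (pi*hbar)^(-1) * exp(-r^2/hbar) * r * 2*pi = (2r/hbar) * exp(-r^2/hbar)
        total += simpson * ring_average * (2.0 * r / hbar) * math.exp(-r * r / hbar)
    return total * h / 3.0


hbar = 0.5
F = lambda z: 1 + 2 * z - z ** 2          # sample holomorphic polynomial
G = lambda z: z ** 3 + (1 - 1j) * z       # sample holomorphic polynomial
dG = lambda z: 3 * z ** 2 + (1 - 1j)      # derivative of G, written by hand

# Left side of (14.26): <F, dG/dz>.  Right side: (1/hbar) <z F, G>.
lhs = integrate_against_mu(lambda z: F(z).conjugate() * dG(z), hbar)
rhs = integrate_against_mu(lambda z: (z * F(z)).conjugate() * G(z), hbar) / hbar
print(lhs, rhs)
```

Both sides should agree to quadrature accuracy, which is the statement that $\hbar\,\partial/\partial z$ and multiplication by $z$ are adjoint to each other on polynomials.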

Definition 14.14 The Segal-Bargmann space, denoted $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, is the space of holomorphic functions $F$ on $\mathbb{C}^n$ for which

\[ \|F\|_\hbar := \left( \int_{\mathbb{C}^n} |F(z)|^2\, \mu_\hbar(z)\, dz \right)^{1/2} < \infty, \]

where $\mu_\hbar$ is as in (14.25). Define raising and lowering operators $a_j^*$ and $a_j$ on $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ by

\[ a_j^* = z_j, \qquad a_j = \hbar\,\frac{\partial}{\partial z_j}, \]

with the domain of $a_j$ and $a_j^*$ consisting of the space of holomorphic polynomials.

In light of Lemma 14.13, the operators $a_j$ and $a_j^*$ satisfy

\[ \langle F, a_j G \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = \langle a_j^* F, G \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} \]

for all holomorphic polynomials $F$ and $G$, thus justifying the notation $a_j^*$ for the raising operator. The space $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ is also sometimes called the Fock space. It should be noted, however, that in quantum field theory, the term Fock space also refers to a different (but related) space: the completion of the tensor algebra over a fixed Hilbert space.

Proposition 14.15 The Segal-Bargmann space is complete with respect to the norm $\|\cdot\|_\hbar$ and forms a Hilbert space with respect to the associated inner product,

\[ \langle F, G \rangle_\hbar := \int_{\mathbb{C}^n} \overline{F(z)}\, G(z)\, \mu_\hbar(z)\, dz. \]

Furthermore, the space of holomorphic polynomials forms a dense subspace of the Segal-Bargmann space.

Note that elements of $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ are actual functions on $\mathbb{C}^n$, not equivalence classes of functions. Nevertheless, we can regard $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ as a subspace of $L^2(\mathbb{C}^n, \mu_\hbar)$, since each equivalence class of almost-everywhere equal functions contains at most one holomorphic representative.

Proof. Given any $z_0 \in \mathbb{C}^n$ and $R > 0$, let $P_{z_0,R}$ denote the polydisk given by

\[ P_{z_0,R} = \{ z \in \mathbb{C}^n : |z_j - (z_0)_j| < R,\ j = 1, \ldots, n \}. \]

Using a power-series argument, it is easy to show that the value of a holomorphic function $F$ at $z_0$ is equal to the average of $F$ over $P_{z_0,R}$. We can then multiply and divide by $\mu_\hbar$ to obtain

\[ F(z_0) = \frac{1}{(\pi R^2)^n} \int_{\mathbb{C}^n} 1_{P_{z_0,R}}(z)\, \frac{1}{\mu_\hbar(z)}\, F(z)\, \mu_\hbar(z)\, dz. \]

The Cauchy-Schwarz inequality then tells us that

\[ |F(z_0)| \le \frac{1}{(\pi R^2)^n}\, \left\| \frac{1_{P_{z_0,R}}}{\mu_\hbar} \right\|_{L^2(\mathbb{C}^n, \mu_\hbar)} \|F\|_{L^2(\mathbb{C}^n, \mu_\hbar)}. \tag{14.27} \]

This inequality tells us that pointwise evaluation [the map $F \mapsto F(z_0)$] is a bounded linear functional on the Segal-Bargmann space.

Suppose now that $F_n$ is a sequence of holomorphic functions such that $F_n$ converges in $L^2(\mathbb{C}^n, \mu_\hbar)$ to some $F$. Using (14.27), we can easily show that $F_n$ converges to $F$ uniformly on compact sets, which implies that $F$ is also holomorphic. This shows that the holomorphic subspace of $L^2(\mathbb{C}^n, \mu_\hbar)$ is closed and hence is a Hilbert space.

To show the denseness of polynomials, consider some $F \in \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ and let

\[ F(z) = \sum_{\mathbf{n}} a_{\mathbf{n}} z^{\mathbf{n}} \tag{14.28} \]

be the Taylor expansion of $F$, where $\mathbf{n}$ ranges over all multi-indices. This series converges to $F$ uniformly on compact subsets of $\mathbb{C}^n$. We claim that the terms in (14.28) are orthogonal. To see this, use Fubini's theorem to perform the integration of $z^{\mathbf{n}}$ against $\bar z^{\mathbf{m}}$ one variable at a time. Using polar coordinates in each copy of $\mathbb{C}$, we can see that we will get zero if the power of $z_j$ in $z^{\mathbf{n}}$ is not the same as the power of $\bar z_j$ in $\bar z^{\mathbf{m}}$.

Since it is orthogonal, the series in (14.28) will converge in $L^2(\mathbb{C}^n, \mu_\hbar)$ provided that the sum of the squares of the norms of the terms is finite. If $P_{0,R}$ is a sequence of polydisks of increasing radius centered at the origin, the argument in the preceding paragraph shows that the terms in (14.28) are orthogonal in $L^2(P_{0,R}, \mu_\hbar)$. Since the series converges uniformly on $P_{0,R}$, we can then interchange sum and integral to obtain

\[ \sum_{\mathbf{n}} |a_{\mathbf{n}}|^2\, \|z^{\mathbf{n}}\|^2_{L^2(P_{0,R}, \mu_\hbar)} = \|F\|^2_{L^2(P_{0,R}, \mu_\hbar)}. \]

By applying monotone convergence to both the sum over $\mathbf{n}$ and the integrals over $P_{0,R}$, we may let $R$ tend to infinity to obtain

\[ \sum_{\mathbf{n}} |a_{\mathbf{n}}|^2\, \|z^{\mathbf{n}}\|^2_{L^2(\mathbb{C}^n, \mu_\hbar)} = \|F\|^2_{L^2(\mathbb{C}^n, \mu_\hbar)} < \infty. \]

Thus, the series in (14.28) converges in $L^2(\mathbb{C}^n, \mu_\hbar)$ and this $L^2$ limit must coincide with the pointwise limit, namely $F$ itself. ■
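The norm identity at the end of the proof can be illustrated with a concrete function. The sketch below takes $n = 1$ and $F(z) = e^{cz}$, whose Taylor coefficients are $a_k = c^k/k!$, and assumes the standard value $\|z^k\|^2 = \hbar^k k!$ of the monomial norms (a Gaussian moment computation that is not carried out in the text). Under that assumption the partial sums of $\sum_k |a_k|^2\|z^k\|^2$ should increase to $e^{\hbar|c|^2}$. The names and sample values are illustrative.

```python
import math


def monomial_norm_sq(k, hbar):
    """Assumed closed form for ||z^k||^2 in the n = 1 Segal-Bargmann space."""
    return hbar ** k * math.factorial(k)


def partial_norm_sq(c, hbar, N):
    """Sum over k <= N of |a_k|^2 ||z^k||^2 for F(z) = exp(c z), a_k = c^k / k!."""
    return sum(
        abs(c) ** (2 * k) / math.factorial(k) ** 2 * monomial_norm_sq(k, hbar)
        for k in range(N + 1)
    )


c, hbar = 1 + 2j, 0.5                      # hypothetical sample values
limit = math.exp(hbar * abs(c) ** 2)       # expected squared norm of exp(c z)
for N in (2, 5, 10, 40):
    print(N, partial_norm_sq(c, hbar, N), limit)
```

The partial sums are the squared norms of the Taylor polynomials of $F$, so their convergence is exactly the statement that the polynomial approximations converge to $F$ in the Segal-Bargmann norm.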


14.4.2 The Exponentiated Commutation Relations

To apply the Stone-von Neumann theorem to the Segal-Bargmann space, we define self-adjoint "position" and "momentum" operators as follows:

\[ A_j = \frac{1}{\sqrt{2}}\left( z_j + \hbar\frac{\partial}{\partial z_j} \right), \qquad B_j = \frac{i}{\sqrt{2}}\left( z_j - \hbar\frac{\partial}{\partial z_j} \right). \]

We will identify one-parameter unitary groups having (extensions of) these operators as their infinitesimal generators, which will show (by Stone's theorem) that the generators are indeed self-adjoint on suitable domains. We will then verify the exponentiated commutation relations and check irreducibility.

Let us compute heuristically and then check that our results are correct. If we formally apply Theorem 14.1 to the (unbounded) operators $-\sum_j \bar a_j z_j$ and $\hbar \sum_j a_j\, \partial/\partial z_j$, we obtain

\[ \exp\left\{ -\sum_j \bar a_j z_j + \hbar \sum_j a_j \frac{\partial}{\partial z_j} \right\} = e^{-\hbar|a|^2/2}\, \exp\left\{ -\sum_j \bar a_j z_j \right\} \exp\left\{ \hbar \sum_j a_j \frac{\partial}{\partial z_j} \right\}. \tag{14.29} \]

This calculation suggests that we define operators $T_a$ by the formula

\[ (T_a F)(z) = e^{-\hbar|a|^2/2}\, e^{-\bar a\cdot z}\, F(z + \hbar a), \qquad a \in \mathbb{C}^n, \tag{14.30} \]

where for any $a, b \in \mathbb{C}^n$, we define $a\cdot b = \sum_j a_j b_j$ (no complex conjugates). Since the exponent on the left-hand side of (14.29) is skew-self-adjoint (the difference of an operator and its adjoint), we expect the operators $T_a$ to be unitary. For suitable choices of $a$, the operator on the left-hand side of (14.29) will become the one-parameter group generated by $A_j$ or $B_j$.




Theorem 14.16 For each $a \in \mathbb{C}^n$, the operator $T_a$ defined by (14.30) is a unitary operator on the Segal-Bargmann space, and the map $a \mapsto T_a$ is strongly continuous. These operators satisfy

\[ T_a T_b = e^{i\hbar\,\mathrm{Im}(\bar a\cdot b)}\, T_{a+b}. \tag{14.31} \]

In particular, for each $j$, the maps

\[ U_j(t) := T_{ite_j/\sqrt{2}}; \qquad V_j(t) := T_{te_j/\sqrt{2}} \]

are strongly continuous one-parameter unitary groups. The infinitesimal generators $A_j$ and $B_j$ of these groups satisfy the exponentiated commutation relations.

For any $F \in \mathrm{Dom}(A_j)$, we have

\[ (A_j F)(z) = \frac{1}{\sqrt{2}}\left( z_j F(z) + \hbar\frac{\partial F}{\partial z_j} \right) \]

and for any $F \in \mathrm{Dom}(B_j)$, we have

\[ (B_j F)(z) = \frac{i}{\sqrt{2}}\left( z_j F(z) - \hbar\frac{\partial F}{\partial z_j} \right). \]

Furthermore, the domains of $A_j$ and $B_j$ contain all holomorphic polynomials.

Finally, the operators $A_j$ and $B_j$ act irreducibly on the Segal-Bargmann space, in the sense of Definition 14.6.


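Proof. It is evident that $T_a F(z)$ is holomorphic as a function of $z$ for each fixed $a$. Meanwhile, for any $F \in \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, we have

\[ \begin{aligned} \|T_a F\|^2_{L^2(\mathbb{C}^n, \mu_\hbar)} &= (\pi\hbar)^{-n} \int_{\mathbb{C}^n} e^{-\hbar|a|^2}\, e^{-2\mathrm{Re}(\bar a\cdot z)}\, |F(z + \hbar a)|^2\, e^{-|z|^2/\hbar}\, dz \\ &= (\pi\hbar)^{-n} \int_{\mathbb{C}^n} e^{-|z+\hbar a|^2/\hbar}\, |F(z + \hbar a)|^2\, dz \\ &= \|F\|^2_{L^2(\mathbb{C}^n, \mu_\hbar)}, \end{aligned} \]

showing that $T_a$ is isometric. The formula for $T_a T_b$ follows from direct computation (Exercise 7), and from this formula we see that $T_a T_{-a} = I$, which shows that $T_a$ is surjective and thus unitary. The strong continuity of $T_a$ is easily verified on polynomials (Exercise 8), which are dense in $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$.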
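The composition law (14.31) is a pointwise identity, so it can be spot-checked directly. The following minimal sketch implements (14.30) for $n = 1$ as an operation on Python callables; the test function $F$ and the sample values of $a$, $b$, $z$, and $\hbar$ are arbitrary illustrative choices.

```python
import cmath


def T(a, hbar):
    """Return the map F -> T_a F of (14.30) for n = 1, acting on callables."""
    a = complex(a)

    def apply(F):
        prefactor = cmath.exp(-hbar * abs(a) ** 2 / 2.0)
        return lambda z: prefactor * cmath.exp(-a.conjugate() * z) * F(z + hbar * a)

    return apply


hbar = 0.8
a, b = 0.3 - 1.1j, -0.7 + 0.4j
F = lambda z: z ** 2 + cmath.exp(0.3 * z)   # an entire function used as test input
z0 = 0.25 + 0.6j

# Left side of (14.31): T_a applied to (T_b F), evaluated at z0.
lhs = T(a, hbar)(T(b, hbar)(F))(z0)
# Right side: the phase exp(i hbar Im(conj(a) b)) times (T_{a+b} F)(z0).
phase = cmath.exp(1j * hbar * (a.conjugate() * b).imag)
rhs = phase * T(a + b, hbar)(F)(z0)
print(abs(lhs - rhs))

# Special case b = -a: the phase is trivial, so T_a T_{-a} F = F.
print(abs(T(a, hbar)(T(-a, hbar)(F))(z0) - F(z0)))
```

Swapping the order of $a$ and $b$ changes the sign of $\mathrm{Im}(\bar a\cdot b)$, which is the source of the exponentiated commutation relations for $U_j$ and $V_j$.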

It easily follows from (14.31) that $U_j(\cdot)$ and $V_j(\cdot)$ are one-parameter unitary groups, and also that (the infinitesimal generators of) these unitary groups satisfy the exponentiated commutation relations. If $F$ is in the domain of the infinitesimal generator of $U_j(\cdot)$, the limit

\[ (A_j F)(z) := \frac{1}{i} \lim_{t\to 0} \frac{ e^{-\hbar t^2/4}\, e^{itz_j/\sqrt{2}}\, F(z + it\hbar e_j/\sqrt{2}) - F(z) }{t} \tag{14.32} \]

must exist in $L^2(\mathbb{C}^n, \mu_\hbar)$. The $L^2$ limit coincides with the easily computed pointwise limit, giving

\[ A_j F(z) = \frac{1}{i}\left( \frac{i}{\sqrt{2}}\, z_j F(z) + \frac{i\hbar}{\sqrt{2}}\, \frac{\partial F}{\partial z_j} \right), \]

as claimed. If $F$ is a polynomial, it is easily shown, using dominated convergence, that the limit in (14.32) exists in $L^2(\mathbb{C}^n, \mu_\hbar)$. The analysis of $B_j$ is similar.

Finally, we address irreducibility. If the $A_j$'s and $B_j$'s did not act irreducibly, then in the application of the Stone-von Neumann theorem to $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, there would exist at least two subspaces $V_l$. Thus, there would exist at least two linearly independent vectors $F_l$ such that for all $j$, we have that $F_l$ is in the domain of $A_j$ and $B_j$ and

\[ (A_j + iB_j)F_l = \frac{2}{\sqrt{2}}\, \hbar\, \frac{\partial F_l}{\partial z_j} = 0. \]

(Take $F_l$ to be the preimage under $U_l$ of the function $\phi_0$ in (14.12), with $\sigma = \hbar$.) This would mean that each $F_l$ is constant, contradicting the assumption that the $F_l$'s are linearly independent. ■

14.4.3 The Reproducing Kernel

According to (14.27), evaluation of $F \in \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ at a fixed point $z$ is a continuous linear functional. Thus, this linear functional can be written as the inner product with a unique element $\chi_z$ of $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, which we now compute. The vector $\chi_z$ is called the coherent state with parameter $z$.

Proposition 14.17 For all $F \in \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, we have

\[ F(z) = \int_{\mathbb{C}^n} e^{z\cdot\bar w/\hbar}\, F(w)\, \mu_\hbar(w)\, dw. \tag{14.33} \]

The function $e^{z\cdot\bar w/\hbar}$ is called the reproducing kernel for $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, since integration against this kernel simply gives back (or "reproduces") the function $F$. Of course, the relation (14.33) holds only for holomorphic functions in $L^2(\mathbb{C}^n, \mu_\hbar)$. Equation (14.33) can be rewritten as

\[ F(z) = \langle \chi_z, F \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)}, \]

where

\[ \chi_z(w) = e^{\bar z\cdot w/\hbar}. \]

Proof. We begin by establishing the result in the case $z = 0$. We have already established, in the proof of Proposition 14.15, that the Taylor series of $F$ converges to $F$ in $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, and the distinct monomials in this series are orthogonal. Thus, when computing $\langle 1, F \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)}$, only the constant term in the expansion of $F$ survives, giving

\[ \langle 1, F \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = F(0)\, \langle 1, 1 \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = F(0), \tag{14.34} \]

since $\mu_\hbar$ is a probability measure. But this relation is precisely the $z = 0$ case of (14.33).

Let us now apply (14.34) to $T_a F$, where $T_a$ is the unitary operator in (14.30). According to Theorem 14.16, $T_a$ is unitary with inverse equal to $T_{-a}$, giving

\[ (T_a F)(0) = \langle 1, T_a F \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = \langle T_{-a} 1, F \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)}. \]

Writing this relation out using $w$ as our variable of integration gives

\[ e^{-\hbar|a|^2/2}\, F(\hbar a) = \int_{\mathbb{C}^n} e^{-\hbar|a|^2/2}\, e^{a\cdot\bar w}\, F(w)\, \mu_\hbar(w)\, dw. \]

Setting $a = z/\hbar$ and simplifying gives the desired result. ■
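The reproducing property (14.33) can also be checked numerically for $n = 1$. The sketch below integrates $e^{z\bar w/\hbar}F(w)$ against the density $(\pi\hbar)^{-1}e^{-|w|^2/\hbar}$ in polar coordinates and compares the result with $F(z)$; the quadrature scheme, the sample polynomial, and the evaluation point are illustrative assumptions.

```python
import cmath
import math


def integrate_against_mu(G, hbar, n_r=2000, n_theta=64):
    """Approximate the integral of G(w) * mu_hbar(w) over C (the n = 1 case).

    Simpson's rule in the radius on [0, 12*sqrt(hbar)], equally weighted
    average over n_theta angles.
    """
    r_max = 12.0 * math.sqrt(hbar)
    h = r_max / n_r  # n_r must be even for Simpson's rule
    units = [cmath.exp(2j * math.pi * k / n_theta) for k in range(n_theta)]
    total = 0j
    for i in range(n_r + 1):
        r = i * h
        simpson = 1.0 if i in (0, n_r) else (4.0 if i % 2 else 2.0)
        ring_average = sum(G(r * u) for u in units) / n_theta
        total += simpson * ring_average * (2.0 * r / hbar) * math.exp(-r * r / hbar)
    return total * h / 3.0


def reproduce(F, z, hbar):
    """Right-hand side of (14.33): integrate the kernel exp(z conj(w)/hbar) against F."""
    return integrate_against_mu(
        lambda w: cmath.exp(z * w.conjugate() / hbar) * F(w), hbar
    )


hbar = 1.0
F = lambda w: w ** 3 - 2 * w + 1      # sample holomorphic polynomial
z = 0.5 + 0.3j
print(reproduce(F, z, hbar), F(z))
```

Replacing $F$ by a non-holomorphic function such as $w \mapsto \bar w$ makes the two printed values disagree, in line with the remark that (14.33) holds only on the holomorphic subspace.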

14.4.4 The Segal-Bargmann Transform

Since the operators $A_j$ and $B_j$ in Theorem 14.16 satisfy the exponentiated commutation relations and act irreducibly on $\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$, the second part of the Stone-von Neumann theorem tells us that there is a unitary map $U : \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar) \to L^2(\mathbb{R}^n)$, unique up to a constant, that intertwines these operators with the usual position and momentum operators. The inverse map $V : L^2(\mathbb{R}^n) \to \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$ is called the Segal-Bargmann transform.

Theorem 14.18 Let $V$ be the inverse of the map $U : \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar) \to L^2(\mathbb{R}^n)$ given by the Stone-von Neumann theorem, normalized so that $V$ takes the function $\phi_0 \in L^2(\mathbb{R}^n)$ in (14.12) (with $\sigma = \hbar$) to the constant function $1 \in \mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)$. Then $V$ may be computed as follows:

\[ (V\psi)(z) = (\pi\hbar)^{-n/4} \int_{\mathbb{R}^n} \exp\left\{ -\frac{1}{2\hbar}\left[ z\cdot z - 2\sqrt{2}\, z\cdot x + x\cdot x \right] \right\} \psi(x)\, dx. \]

Recall that we define $a\cdot b = \sum_j a_j b_j$ for all $a, b \in \mathbb{C}^n$, with no complex conjugates in the definition. In particular, the integrand in the formula for $V\psi$ is a holomorphic function of $z$, for each fixed $x$.

Note that the value of $(V\psi)(z)$ at $z = 0$ is simply the inner product of $\psi$ with the ground state function $\phi_0$, with $\sigma = \hbar$. The proof of Theorem 14.18 will show that the value of $(V\psi)(z)$ at an arbitrary $z$ is a certain constant $c_z$ times the inner product of $\psi$ with a phase space translate of $\phi_0$, that is, a vector of the form $e^{ia\cdot X} e^{ib\cdot P}\phi_0$. [See (14.36).] According to (the obvious higher-dimensional counterpart to) Proposition 12.11, $\phi_0$ is a minimum uncertainty state, meaning that equality is achieved in Corollary 12.9 for each $j$. Thus, by (the obvious higher-dimensional counterpart to) Exercise 3 in Chap. 12, each state of the form $e^{ia\cdot X} e^{ib\cdot P}\phi_0$ is also a minimum uncertainty state.

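Proof. By the unitarity of $V$ and the $z = 0$ case of Proposition 14.17, we have

\[ \langle \phi_0, \psi \rangle_{L^2(\mathbb{R}^n)} = \langle V\phi_0, V\psi \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = \langle 1, V\psi \rangle_{\mathcal{H}L^2(\mathbb{C}^n, \mu_\hbar)} = (V\psi)(0). \]

Thus, the value of $V\psi$ at 0 is just the inner product of $\psi$ with $\phi_0$. More generally,

\[ \begin{aligned} \langle e^{-ia\cdot X} e^{-ib\cdot P}\phi_0,\, \psi \rangle &= \langle \phi_0,\, e^{ib\cdot P} e^{ia\cdot X}\psi \rangle \\ &= \langle V\phi_0,\, V e^{ib\cdot P} e^{ia\cdot X}\psi \rangle \\ &= \langle 1,\, e^{ib\cdot B} e^{ia\cdot A} V\psi \rangle \\ &= (e^{ib\cdot B} e^{ia\cdot A} V\psi)(0), \end{aligned} \tag{14.35} \]

where $e^{ia\cdot A}$ means the product (in any order) of the operators $e^{ia_j A_j}$, and similarly for $e^{ib\cdot B}$.

Recall that the $A_j$'s and $B_j$'s are defined as the infinitesimal generators of the groups $U_j$ and $V_j$ in Theorem 14.16, which in turn are defined in terms of the operators $T_a$. If we use (14.31) to compute the right-hand side of (14.35), we obtain

\[ \begin{aligned} (e^{ib\cdot B} e^{ia\cdot A} V\psi)(0) &= (T_{b/\sqrt{2}}\, T_{ia/\sqrt{2}}\, V\psi)(0) \\ &= e^{i\hbar a\cdot b/2}\, (T_{(b+ia)/\sqrt{2}}\, V\psi)(0) \\ &= e^{i\hbar a\cdot b/2}\, e^{-\hbar(|a|^2+|b|^2)/4}\, (V\psi)(\hbar(b+ia)/\sqrt{2}). \end{aligned} \]

Thus, if we apply (14.35) with $a = \sqrt{2}\,y_0/\hbar$ and $b = \sqrt{2}\,x_0/\hbar$, we obtain

\[ \langle e^{-i\sqrt{2}\,y_0\cdot X/\hbar}\, e^{-i\sqrt{2}\,x_0\cdot P/\hbar}\phi_0,\, \psi \rangle = e^{ix_0\cdot y_0/\hbar}\, e^{-(|x_0|^2+|y_0|^2)/(2\hbar)}\, (V\psi)(x_0 + iy_0). \tag{14.36} \]

Solving (14.36) for $(V\psi)(x_0 + iy_0)$ gives

\[ (V\psi)(x_0 + iy_0) = (\pi\hbar)^{-n/4}\, e^{-ix_0\cdot y_0/\hbar}\, e^{(|x_0|^2+|y_0|^2)/(2\hbar)} \int_{\mathbb{R}^n} e^{i\sqrt{2}\,y_0\cdot x/\hbar}\, e^{-|x-\sqrt{2}x_0|^2/(2\hbar)}\, \psi(x)\, dx, \]

which simplifies to the claimed formula for $V\psi$. ■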
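The integral formula of Theorem 14.18 can be exercised numerically for $n = 1$. The sketch below checks two consequences: the normalization $V\phi_0 = 1$, and $V(x\phi_0) = z/\sqrt{2}$, which is what the intertwining of $X$ with $A = (z + \hbar\,d/dz)/\sqrt{2}$ predicts when applied to the constant function. The function names, the quadrature parameters, and the sample point are illustrative assumptions.

```python
import cmath
import math


def segal_bargmann(psi, z, hbar, steps=6000):
    """Trapezoid-rule value of (V psi)(z) from the formula in Theorem 14.18, n = 1."""
    half_width = 30.0 * math.sqrt(hbar)
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        exponent = -(z * z - 2.0 * math.sqrt(2.0) * z * x + x * x) / (2.0 * hbar)
        total += weight * cmath.exp(exponent) * psi(x)
    return (math.pi * hbar) ** -0.25 * total * h


hbar = 1.0
phi0 = lambda x: (math.pi * hbar) ** -0.25 * math.exp(-x * x / (2.0 * hbar))
x_phi0 = lambda x: x * phi0(x)

z = 0.4 - 0.7j
print(segal_bargmann(phi0, z, hbar))                        # expected: close to 1
print(segal_bargmann(x_phi0, z, hbar), z / math.sqrt(2.0))  # expected: equal
```

Both outputs are independent checks of the constants $(\pi\hbar)^{-n/4}$ and $2\sqrt{2}$ appearing in the kernel.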


14.5  Exercises 

1. Show that if operators $A$ and $B$ satisfy the exponentiated commutation relations of Sect. 14.2, they satisfy the "semi-exponentiated" commutation relations, that is, the hypotheses of Theorem 12.8.

Hint: For any $a, s \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, rearrange the expression

\[ \frac{e^{isA} e^{iaB}\psi - e^{iaB}\psi}{s} \]

using the exponentiated commutation relations. Then let $s$ tend to zero and apply Stone's theorem.


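2. (a) Suppose $\alpha : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map, meaning that

\[ \lim_{h\to 0} \frac{\alpha(t+h) - \alpha(t)}{h} \]

exists in the norm topology of $\mathcal{B}(\mathbf{H})$ for each $t$. Show that if $d\alpha/dt = 0$ for all $t$, then $\alpha$ is constant.

(b) Suppose $\alpha : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map such that

\[ \frac{d\alpha}{dt} = \alpha(t)A \]

for some fixed $A \in \mathcal{B}(\mathbf{H})$. Show that $\alpha(t) = \alpha(0)e^{tA}$ for all $t$.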


3. Show that the operators $A_j := P_j$ and $B_j := -X_j$ on $L^2(\mathbb{R}^n)$ satisfy the exponentiated commutation relations. Determine the unitary operator $U : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ (unique up to a constant) such that

\[ U e^{itA_j} U^{-1} = e^{itX_j}, \qquad U e^{itB_j} U^{-1} = e^{itP_j}. \]

4. Verify that the operators $U_{a,b}(t)$ in (14.9) form a strongly continuous one-parameter unitary group.

5. In this exercise, we develop a discrete version of (the $n = 1$ case of) the Stone-von Neumann theorem. Let $p$ be a prime number, let $\mathbb{Z}/p$ denote the field of integers modulo $p$, and let $\hbar$ be a nonzero element of $\mathbb{Z}/p$. Consider the finite-dimensional Hilbert space $L^2(\mathbb{Z}/p)$, taken with respect to the counting measure on $\mathbb{Z}/p$. Let $U$ denote the "modulation" operator

\[ (Uf)(n) = e^{2\pi i n/p} f(n) \]

and let $V$ denote the "translation" operator on $L^2(\mathbb{Z}/p)$, given by

\[ (Vf)(n) = f(n + \hbar). \]

In the case of the modulation operator, note that the expression $e^{2\pi i n/p}$ descends unambiguously from $n \in \mathbb{Z}$ to $n \in \mathbb{Z}/p$.

(a) Verify that $U^p = V^p = I$ and that, for all $l$ and $m$ in $\mathbb{Z}$,

\[ U^l V^m = e^{-2\pi i l m \hbar/p}\, V^m U^l. \]

(b) Suppose now that $A$ and $B$ are unitary operators on a finite-dimensional Hilbert space $\mathbf{H}$ satisfying $A^p = B^p = I$ and

\[ A^l B^m = e^{-2\pi i l m \hbar/p}\, B^m A^l. \]

Suppose also that the only subspaces of $\mathbf{H}$ invariant under both $A$ and $B$ are $\{0\}$ and $\mathbf{H}$. Show that there is a unitary map $W$ from $\mathbf{H}$ to $L^2(\mathbb{Z}/p)$ such that

\[ WAW^{-1} = U, \qquad WBW^{-1} = V. \]

Hint: Show that if $v \in \mathbf{H}$ is an eigenvector for $A$, then so is $B^l v$ for any $l$. Show that each eigenspace for $A$ has dimension 1 and identify the associated eigenvectors with the "$\delta$-functions" in $L^2(\mathbb{Z}/p)$.

6. Given a constant $u \in \mathbb{C}$ with $|u| = 1$ and a pair of vectors $a, b \in \mathbb{R}^n$, let $U_{u,a,b}$ be the unitary operator on $L^2(\mathbb{R}^n)$ given by

\[ (U_{u,a,b}\psi)(x) = u\, e^{ia\cdot x}\, \psi(x + \hbar b). \]

(a) Verify that the set of operators of this form is a group under the operation of composition, and denote this group by $H_n$.

(b) Let $\tilde H_n$ denote the set of $(n+2) \times (n+2)$ matrices of the form

\[ A = \begin{pmatrix} 1 & a_1 & \cdots & a_n & c \\ & 1 & & & b_1 \\ & & \ddots & & \vdots \\ & & & 1 & b_n \\ & & & & 1 \end{pmatrix} \]

with $a_1, \ldots, a_n$, $b_1, \ldots, b_n$, and $c$ in $\mathbb{R}$. (The only nonzero entries in $A$ are on the main diagonal, in the first row, and in the last column.) Verify that $\tilde H_n$ forms a group under matrix multiplication. Show that there is a surjective group homomorphism $\Phi : \tilde H_n \to H_n$ with discrete kernel.

Hint: Compare the formulas for group multiplication in $H_n$ and $\tilde H_n$.

Note: In the language of Chap. 16, $\tilde H_n$ is the universal covering group of $H_n$. The group $\tilde H_n$ is called the Heisenberg group.

7. Show by direct computation that the operators $T_a$ in (14.30) satisfy the relations (14.31).

8. Using dominated convergence, show that for every holomorphic polynomial $F$ on $\mathbb{C}^n$, we have

\[ \lim_{a\to b} \| T_a F - T_b F \|^2 = 0, \]

where $T_a$ is as in (14.30).


15 

The  WKB  Approximation 


15.1  Introduction 

The  WKB  method,  named  for  Gregor  Wentzel,  Hendrik  Kramers,  and  Leon 
Brillouin,  gives  an  approximation  to  the  eigenfunctions  and  eigenvalues  of 
the  Hamiltonian  operator  H  in  one  dimension.  The  approximation  is  best 
understood  as  applying  to  a  fixed  range  of  energies  as  h  tends  to  zero.  (It 
is  also  reasonable  in  many  cases  to  think  of  the  approximation  as  applying 
to  a  fixed  value  of  h  as  the  energy  tends  to  infinity.) 

The idea of the WKB approximation is that the potential function $V(x)$
can be thought of as being “slowly varying,” with the result that solutions
to the time-independent Schrödinger equation will look locally like the
solutions in the case of a constant potential. In the classically allowed region,
this line of thinking will yield an approximation consisting of a rapidly
oscillating complex exponential multiplied by a slowly varying amplitude. We
make the “local frequency” of the exponential equal to what it would be if
$V$ were constant. Having made this choice, there is a unique choice for the
amplitude that yields an error that is of order $\hbar^2$. This amplitude, however,
tends to infinity as we approach the “turning points,” that is, the points
where the classical particle changes directions. Similarly, in the classically
forbidden region, we obtain approximate solutions that are rapidly growing
or decaying exponentials, multiplied by a slowly varying factor. Again,
there is a unique choice for the slowly varying factor that gives errors of
order $\hbar^2$, and again, this factor blows up at the turning points.

The difficulty near the turning points means that we cannot directly
“match” the approximate solutions in different regimes the way we did in
Chap. 5. Instead, we will use the Airy function to approximate the solution
to the Schrödinger equation near the turning points. Asymptotics of the
Airy function will then yield the appropriate matching condition, which
turns out to be a corrected form of the Bohr-Sommerfeld rule that appears
in the “old” quantum theory.

15.2  The  Old  Quantum  Theory  and  the 
Bohr-Sommerfeld  Condition 

The  old  quantum  theory,  developed  by  Bohr,  Sommerfeld,  and  de  Broglie, 
among  others,  may  be  pictured  as  follows.  Consider,  for  simplicity,  a  par¬ 
ticle  with  one  degree  of  freedom,  and  let  C  be  a  level  set  in  phase  space  of 
the  Hamiltonian, 


$$ C = \left\{ (x,p) \in \mathbb{R}^2 \mid H(x,p) = E \right\}, \tag{15.1} $$


which we assume to be a closed curve. We now imagine drawing a “wave”
on $C$, that is, some oscillatory function defined over $C$. Following the de
Broglie hypothesis (Sect. 1.2.2), we postulate that the local frequency $k$ of
the wave as a function of $x$ is $p/\hbar$. This means that the phase of our wave
should be obtained by integrating the 1-form

$$ \frac{1}{\hbar}\, p\, dx \tag{15.2} $$


along the curve. Thus, the wave itself can be pictured as a function on $C$
of the form

$$ \cos\left( \frac{1}{\hbar} \int_{x_0}^{x} p\, dx - \delta \right), \tag{15.3} $$

where $x_0$ is some arbitrary starting point on the curve $C$ and where $\delta$ is an
arbitrary phase. Note that the old quantum theory did not offer a physical
interpretation of this wave; it was simply a crude attempt to introduce
waves into the picture.

The Bohr-Sommerfeld condition is simply the requirement that the function
in (15.3) should match up with itself when we go all the way around
the curve. This will happen precisely if

$$ \frac{1}{\hbar} \oint_C p\, dx = 2\pi n, \tag{15.4} $$


for  some  integer  n.  The  energy  levels  in  the  old  quantum  theory  were  taken 
to  be  those  numbers  E  for  which  the  corresponding  level  curve  C  sat¬ 
isfies  the  Bohr-Sommerfeld  condition  (15.4).  Although  Bohr-Sommerfeld 
quantization had some successes, notably explaining the energy levels of
the  hydrogen  atom,  it  ultimately  failed  to  correctly  predict  the  energies  of 
complex  systems. 

For  systems  with  one  degree  of  freedom,  a  vestige  of  the  Bohr-Sommerfeld 
approach  survives  in  modern  quantum  theory,  with  two  modifications. 
First, the condition (15.4) has to be corrected by replacing the $n$ by $n + 1/2$
on the right-hand side of (15.4). (The replacement of $n$ by $n + 1/2$ is known
as the Maslov correction.) Second, this condition does not (in most cases)
give  the  exact  energy  levels,  but  only  the  leading-order  semiclassical  ap¬ 
proximation  to  the  energy  levels.  The  preceding  discussion  leads  to  the 
following  definition. 

Condition 15.1 A number $E$ is said to satisfy the Maslov-corrected
Bohr-Sommerfeld condition if

$$ \frac{1}{\hbar} \oint_C p\, dx = 2\pi \left( n + \frac{1}{2} \right) \tag{15.5} $$

for some integer $n$, where $C$ is the classical energy curve in (15.1). In light
of Green’s theorem, this condition may be rewritten as

$$ \frac{1}{2\pi\hbar}\, (\text{Area enclosed by } C) = n + \frac{1}{2}. $$
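
As an illustration of the area form of the condition, here is a short worked computation for the harmonic oscillator. The oscillator, with $H(x,p) = p^2/(2m) + m\omega^2 x^2/2$, is an example chosen for convenience and is not taken from the surrounding text; $\omega$ denotes the (assumed) oscillator frequency.

```latex
% The level curve H(x,p) = E is an ellipse with semi-axes
% \sqrt{2mE} (in p) and \sqrt{2E/(m\omega^2)} (in x).
\[
  \text{Area enclosed by } C
  = \pi \sqrt{2mE}\,\sqrt{\frac{2E}{m\omega^2}}
  = \frac{2\pi E}{\omega},
\]
\[
  \frac{1}{2\pi\hbar}\cdot\frac{2\pi E}{\omega} = n + \frac{1}{2}
  \quad\Longrightarrow\quad
  E_n = \hbar\omega\left(n + \tfrac{1}{2}\right).
\]
```

In this special case the corrected condition happens to reproduce the exact oscillator eigenvalues; for a general potential it gives only the leading-order semiclassical approximation.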

When  the  Maslov  correction  is  included,  the  Bohr-Sommerfeld  condition 
can  be  stated  as  saying  that  the  wave  with  phase  given  by  integrating  the 
1-form  in  (15.2)  should  be  180°  out  of  phase  with  itself  after  one  trip  around 
the  energy  curve.  Figure  15.1  shows  an  example,  which  should  be  contrasted 
with  Fig.  1.3.  (Note  also  that  Fig.  1.3  is  drawn  in  the  configuration  space, 
whereas  Fig.  15.1  is  in  the  phase  space.) 

In  our  analysis  in  the  subsequent  sections,  we  will  see  that  the  Maslov 
correction — that  is,  the  extra  1/2  in  (15.5),  as  compared  to  (15.4) — actually 
consists  of  a  contribution  of  1/4  from  each  of  the  two  “turning  points”  of 
the  classical  particle.  (The  turning  points  are  the  points  where  the  classical 
particle  changes  directions.)  Specifically,  in  the  WKB  approximation,  the 
phase of the wave function will be computed as the integral of $(p\,dx)/\hbar$
along one “branch” of the classical energy curve $C$. Using the Airy function
to approximate the wave function near the turning points, we will obtain
an “extra” $\pi/4$ of phase between each turning point and the last local
maximum or minimum of the wave function. Because of the two branches
of $C$, the extra $\pi/4$ of phase near each of the two turning points actually
contributes an extra $\pi$ to the integral on the left-hand side of (15.5).

FIGURE 15.1. A trajectory satisfying the corrected Bohr-Sommerfeld condition
with n = 10.

The reader may wonder why there is no comparable correction term
in our discussion of the Bohr-de Broglie model of the hydrogen atom in
Sect. 1.2.2. One way to answer this question is as follows. As we will see in
Sect. 18.1, the Schrödinger operator for the hydrogen atom can be reduced
to a one-dimensional Schrödinger operator with an effective potential of the
form

$$ -\frac{Q^2}{r} + \frac{\hbar^2 l(l+1)}{2mr^2}. $$

Here  l  is  a  non-negative  integer  that  labels  the  “total  angular  momentum” 
of  the  wave  function.  At  least  when  l  >  0,  one  can  analyze  this  Schrodinger 
operator  using  a  WKB-type  analysis  very  similar  to  the  one  in  the  current 
chapter,  with  one  important  modification:  The  radial  wave  function  [the 
quantity  h(r)  in  (18.5)]  must  be  zero  at  r  =  0  in  order  for  the  wave  function 
to  be  in  the  domain  of  the  Hamiltonian. 

If  one  analyzes  the  situation  carefully,  it  turns  out  that  the  zero  boundary 
condition  at  r  =  0  introduces  another  correction  into  the  Bohr-Sommerfeld 
condition  in  the  amount  of  1/2.  There  is  still  also  a  correction  of  1/4  for 
each  of  the  two  turning  points,  leading  to  the  condition 


$$ \frac{1}{\hbar} \oint_C p\, dx = 2\pi \left( n + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} \right) = 2\pi (n + 1). $$

Since $n + 1$ is again an integer, we are effectively back to the uncorrected
Bohr-Sommerfeld condition. See Chap. 11 of [8] for a discussion of different
approaches to the WKB approximation for radial potentials.


15.3  Classical  and  Semiclassical  Approximations 

We are interested in finding approximate solutions to the time-independent
Schrödinger equation,

$$ -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + (V(x) - E)\,\psi(x) = 0 \tag{15.6} $$

for small values of $\hbar$. Ultimately, we will need to analyze the behavior of
solutions in three different regions: the classically allowed region (points
where $V(x) < E$), the classically forbidden region (points where $V(x) > E$),
and the region near the “turning points,” that is, the points where
$V(x) = E$.

Let us consider at first the classically allowed region. Given a potential
$V$ and an energy level $E$, we can solve (up to a choice of sign) for the
momentum of a classical particle as a function of position as

$$ p(x) = \sqrt{2m(E - V(x))}. $$

We look for approximate solutions $\psi$ to (15.6) of the form

$$ \psi(x) = A(x)\, e^{\pm i S(x)/\hbar}, \tag{15.7} $$

where $S$ satisfies $S'(x) = p(x)$. Note that we are taking the phase of our
wave function to be

$$ \text{phase} = \pm \frac{1}{\hbar} \int p(x)\, dx, $$

as in the old quantum theory in Sect. 15.2. The “amplitude function” $A(x)$
will be chosen to be independent of $\hbar$ and thus “slowly varying” (for small $\hbar$)
compared to the exponent $S(x)/\hbar$.

Our first, elementary, result is that for any number $E$ for which there is
a classically allowed region and for any reasonable choice of the amplitude
$A(x)$ in (15.7), we obtain an approximate eigenvector solution to the
time-independent Schrödinger equation, with an error term of order $\hbar$.


Proposition 15.2 For any two numbers $E_1$ and $E_2$ with $E_1 > \inf_{x \in \mathbb{R}} V(x)$,
there exists a constant $C$ and a nonzero function $A \in C_c^\infty(\mathbb{R})$ with the
following property. For every $E \in [E_1, E_2]$, the support of $A$ is contained
in the classically allowed region at energy $E$ and the function $\psi$ given by

$$ \psi(x) = A(x) \exp\left\{ \pm \frac{i}{\hbar} \int p(x)\, dx \right\} $$

satisfies

$$ \| \hat{H}\psi - E\psi \| \le C\hbar\, \|\psi\|. \tag{15.8} $$

Proof. For any $E \in [E_1, E_2]$, the classically allowed region for energy $E$
contains the classically allowed region for energy $E_1$. We choose, then, $A$ to
be any nonzero element of $C_c^\infty(\mathbb{R})$ with support in the classically allowed
region for energy $E_1$. If we evaluate $\hat{H}\psi - E\psi$ by direct calculation, there
will be a term in which two derivatives fall on the exponential factor, bringing
down a factor involving $p(x)^2$. The definition of $p(x)$ is such that the term
involving $p(x)^2$ will cancel the term involving $V(x) - E$, leaving us with

$$ \hat{H}\psi - E\psi = -\frac{\hbar^2}{2m} \left( A''(x) \pm \frac{i}{\hbar}\, 2A'(x)p(x) \pm \frac{i}{\hbar}\, p'(x)A(x) \right) \exp\left\{ \pm \frac{i}{\hbar} \int p(x)\, dx \right\}. \tag{15.9} $$

(Here, each occurrence of the symbol $\pm$ has the same value, either all pluses
or all minuses.) Thus,

$$ \| \hat{H}\psi - E\psi \| \le \frac{\hbar^2}{2m} \|A''\| + \frac{\hbar}{2m} \| 2A'p + Ap' \|. \tag{15.10} $$

Since $\|\psi\|$ is independent of $\hbar$, the right-hand side of (15.10) is of order
$\hbar\|\psi\|$. It is easy to check that $\|2A'p + Ap'\|$ is bounded as a function of $E$
for any $E$ in the range $[E_1, E_2]$, and the result follows. ■
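
The “direct calculation” behind (15.9) can be written out in a few lines. The following is a sketch of that computation for $\psi = A\,e^{\pm iS/\hbar}$ with $S' = p$ and $p^2 = 2m(E - V)$; the arguments $(x)$ are suppressed for readability.

```latex
\begin{align*}
\psi   &= A\,e^{\pm iS/\hbar}, \qquad S' = p,\\
\psi'  &= \Bigl(A' \pm \tfrac{i}{\hbar}\,pA\Bigr)e^{\pm iS/\hbar},\\
\psi'' &= \Bigl(A'' \pm \tfrac{i}{\hbar}\,(2A'p + p'A)
          - \tfrac{p^2}{\hbar^2}\,A\Bigr)e^{\pm iS/\hbar},\\
-\tfrac{\hbar^2}{2m}\,\psi'' + (V - E)\,\psi
       &= -\tfrac{\hbar^2}{2m}\Bigl(A'' \pm \tfrac{i}{\hbar}\,(2A'p + p'A)\Bigr)
          e^{\pm iS/\hbar}
          + \Bigl(\tfrac{p^2}{2m} + V - E\Bigr)\psi .
\end{align*}
% The last bracket vanishes because p^2 = 2m(E - V).
```

The surviving terms are exactly those in (15.9): one of order $\hbar^2$ involving $A''$ and one of order $\hbar$ involving $2A'p + p'A$.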

Proposition 15.2, along with elementary spectral theory, tells us that for
any $E$ larger than the minimum of $V$, there is a point $\tilde{E}$ in the spectrum
of $\hat{H}$ such that

$$ |E - \tilde{E}| \le C\hbar. \tag{15.11} $$

(See Exercise 4 in Chap. 10.) If we assume that $V(x)$ tends to $+\infty$ as
$x \to \pm\infty$, then $\hat{H}$ will have discrete spectrum and we can say that $\tilde{E}$ is
an eigenvalue for $\hat{H}$. The conclusion, for such potentials, is this: Given any
number $E \in [E_1, E_2]$, there is an eigenvalue of $\hat{H}$ within $C\hbar$ of $E$. Thus, as
$\hbar$ tends to zero, the eigenvalues of $\hat{H}$ “fill up” the entire range of values of
the classical energy function.

Proposition 15.2 is one manifestation of the “classical limit” of quantum
mechanics: the quantum energy spectrum is, in a certain sense, approximating
the classical energy spectrum as $\hbar$ gets small. Notice, however, that
this result tells us only that the eigenvalues are at most order $\hbar$ apart and
nothing further about the location of the individual eigenvalues.

In this chapter, we will show that if $E$ satisfies the corrected
Bohr-Sommerfeld condition, then there exists an eigenvalue $\tilde{E}$ of $\hat{H}$ such that

$$ |E - \tilde{E}| \le C\hbar^{9/8}. \tag{15.12} $$

An estimate of the form (15.12) locates eigenvalues with an error bound
that is small compared to the expected average spacing between the
eigenvalues, which is of order $\hbar$. On the other hand, the approximate energy
levels $E$ are determined by Condition 15.1, which is a condition on the
classical energy curve. Thus, (15.12) can be described as a semiclassical
estimate: It is estimating quantum mechanical quantities (the individual
energy levels) in classical terms (the level curves of the classical
Hamiltonian).

15.4 The WKB Approximation Away
from the Turning Points

We  consider  only  the  simplest  interesting  case  of  the  WKB  approximation, 
in  which  the  following  assumption  holds.  See  the  book  of  Miller  [30]  for 
much  about  this  sort  of  asymptotic  analysis. 

Assumption 15.3 Consider a smooth, real-valued potential $V(x)$, with
$V(x) \to +\infty$ as $x \to \pm\infty$. Assume that the functions $V'(x)/V(x)$ and
$V''(x)/V(x)$ are bounded for $x$ near $\pm\infty$.

Consider also a range of energies of the form $E_1 \le E \le E_2$. Assume
that for each $E$ in this range, there are exactly two points, $a(E)$ and $b(E)$,
with $a(E) < b(E)$, for which $V(x) = E$. Further assume that the derivative
of $V$ is nonzero at $a(E)$ and $b(E)$, for all $E \in [E_1, E_2]$.

See Fig. 15.2 for a typical example. Since $V$ is locally bounded and tends
to $+\infty$ at infinity, $\hat{H}$ is essentially self-adjoint on $C_c^\infty(\mathbb{R})$ (Theorem 9.39)
and has purely discrete spectrum (Theorem XIII.16 in Volume IV of [34]).
The assumption that $V'/V$ and $V''/V$ be bounded near infinity is stronger
than necessary, but still applies to most of the interesting cases.

We refer to $a(E)$ and $b(E)$ as the turning points, since these are the
points  where  a  classical  particle  with  energy  E  changes  direction.  When 
the  energy  E  is  understood  as  being  fixed,  we  will  write  the  turning  points 
simply  as  a  and  b. 


15.4.1  The  Classically  Allowed  Region 


As in Sect. 15.3, we seek approximate solutions to the time-independent
Schrödinger equation having the following form in the classically allowed
region:

$$ \psi = A(x) \exp\left\{ \pm \frac{i}{\hbar} \int p(x)\, dx \right\}, \tag{15.13} $$

where $p(x) = \sqrt{2m(E - V(x))}$ is the momentum of a classical particle with
energy $E$ and position $x$. According to (15.9), this form for $\psi$ gives

$$ \hat{H}\psi - E\psi = -\frac{\hbar^2}{2m} \left( A''(x) \pm \frac{i}{\hbar}\, 2A'(x)p(x) \pm \frac{i}{\hbar}\, p'(x)A(x) \right) \exp\left\{ \pm \frac{i}{\hbar} \int p(x)\, dx \right\}. \tag{15.14} $$

Since we want to obtain an approximate solution with an error smaller
than $\hbar$, we require that the second and third terms in parentheses in (15.14)
cancel. This cancellation will occur if $A$ satisfies

$$ 2A'(x)p(x) = -p'(x)A(x) $$

or

$$ \frac{A'(x)}{A(x)} = -\frac{1}{2} \frac{p'(x)}{p(x)}, \tag{15.15} $$

which we can easily solve (Exercise 3) as

$$ A(x) = C\,(p(x))^{-1/2}. \tag{15.16} $$

If $A$ is given by (15.16), we will have

$$ \hat{H}\psi - E\psi = -\frac{\hbar^2}{2m} \frac{A''(x)}{A(x)}\, \psi(x), \tag{15.17} $$

indicating that our error is of order $\hbar^2$. This expression, however, is only
local, in that it applies only in the classically allowed region. Furthermore,
$p(x)$ tends to zero at the turning points, which means that $A(x)$ becomes
unbounded at these points. This blow-up of the amplitude is a substantial
complicating factor in the analysis.

We can get an approximate solution to the Schrödinger equation by taking
a linear combination of the function in (15.13) with two different choices
for the sign in the exponent, with constants $c_1$ and $c_2$. It is convenient to
take the basepoint of our integration to be the left-hand turning point
$a = a(E)$. Furthermore, since the Schrödinger operator $\hat{H}$ commutes with
complex conjugation, the real and imaginary parts of any solution to the
time-independent Schrödinger equation are again solutions. We will therefore
consider only real-valued approximate solutions, i.e., those in which
$c_2 = \overline{c_1}$. Using Exercise 1, we can then write our approximate solution as
follows.

Summary 15.4 Suppose $\psi$ is a real-valued solution to the time-independent
Schrödinger equation. Then in the classically allowed region but away from
the turning points, we expect that $\psi$ is well approximated by an expression
of the form

$$ \frac{R}{\sqrt{p(x)}} \cos\left\{ \frac{1}{\hbar} \int_a^x p(y)\, dy - \delta \right\}, \tag{15.18} $$

where $p(x) = \sqrt{2m(E - V(x))}$ is the momentum of a classical particle with
energy $E$ and position $x$. Here $R$ and $\delta$ are real constants, referred to as
the amplitude and the phase of the approximate solution.


We refer to the function in (15.18) as the oscillatory WKB function. In
integrating the square of the oscillatory WKB function over some interval,
we may apply the identity $\cos^2\theta = (1 + \cos(2\theta))/2$ to the cosine factor.
The rapidly oscillating $\cos(2\theta)$ term will be small for small $\hbar$ because of
cancellation between positive and negative values. Thus, the integral of
$\psi^2(x)$ over an interval will be, to leading order, just a constant times the
integral of $1/p(x)$, or, equivalently, a constant times the integral of $1/v(x)$,
where $v$ is the velocity of the classical particle. But the integral of
$1/v(x) = dt/dx$ with respect to $x$ is just the time $t$ that the classical particle
spends in the interval. We obtain, then, the following result.


Conclusion 15.5 If the amplitude $R$ in (15.18) is chosen so that $\psi$ has
$L^2$ norm 1 over $[a, b]$, then the probability of finding the quantum particle in
an interval $[c, d] \subset [a, b]$ is approximately the fraction of time the classical
particle spends in $[c, d]$ over one period of classical motion.
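
The statement of Conclusion 15.5 can be checked numerically in a simple case. The sketch below is illustrative only: the function names, the midpoint-rule quadrature, and the test potential $V(x) = x^2/2$ (with $m = 1$, $E = 1/2$, so that $a = -1$, $b = 1$ and the classical motion is $x(t) = \sin t$) are all assumptions made here, not part of the text. It compares the normalized integral of $1/p(x)$ over $[c,d]$ with the classical time fraction.

```python
import math

def inverse_momentum(x, E, V, m=1.0):
    """Return 1/p(x) inside the classically allowed region, else 0."""
    gap = E - V(x)
    return 1.0 / math.sqrt(2.0 * m * gap) if gap > 0.0 else 0.0

def midpoint_integral(f, lo, hi, steps=200_000):
    """Midpoint rule; it never samples the endpoints, where 1/p blows up."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def wkb_probability(c, d, a, b, E, V, m=1.0):
    """Leading-order WKB probability of [c, d]: a ratio of integrals of 1/p."""
    weight = lambda x: inverse_momentum(x, E, V, m)
    return midpoint_integral(weight, c, d) / midpoint_integral(weight, a, b)

# Hypothetical test case: V(x) = x^2/2 with m = 1 and E = 1/2 (a = -1, b = 1).
V = lambda x: 0.5 * x * x
quantum = wkb_probability(0.0, 0.5, -1.0, 1.0, 0.5, V)

# Classical motion x(t) = sin(t): time fraction in [c, d] is (asin d - asin c)/pi.
classical = (math.asin(0.5) - math.asin(0.0)) / math.pi

print(quantum, classical)
```

The midpoint rule is used because the integrand has integrable inverse-square-root singularities at the turning points; the two printed numbers should agree only up to the (small) quadrature error.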


15.4.2  The  Classically  Forbidden  Region 

In the classically forbidden region, let us introduce the quantity

$$ q(x) := \sqrt{2m(V(x) - E)}. $$

We look for approximate solutions to the Schrödinger equation (15.6) of
the form

$$ \psi(x) = A(x) \exp\left\{ \pm \frac{1}{\hbar} \int q(y)\, dy \right\}. $$

If we analyze approximate solutions of this form precisely as in the
classically allowed region, we again find that there is a unique choice for $A$ (up
to multiplication by a constant) that causes the order-$\hbar$ terms in $\hat{H}\psi - E\psi$
to cancel, namely $A(x) = C\,(q(x))^{-1/2}$. If we are hoping to approximate a
square-integrable solution of the Schrödinger equation, we want to take a
minus sign in the exponent on the interval $(b, \infty)$, and it is convenient to
take the basepoint of our integration to be $b$. In the region $(-\infty, a)$, we want to
take a plus sign in the exponent; it is then convenient to take the basepoint
of our integration to be $a$ and to reverse the direction of integration, which
changes the sign in the exponent back to being negative.

FIGURE 15.3. The WKB functions, extended all the way to the turning points.


Summary 15.6 If $\psi_1(x)$ is a solution to the time-independent Schrödinger
equation that tends to zero as $x$ approaches $-\infty$, we expect that $\psi_1$ will be
well approximated on $(-\infty, a)$, but away from the turning point, by the
expression

$$ \frac{c_1}{\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_x^a q(y)\, dy \right\}, \tag{15.19} $$

where $q(x) = \sqrt{2m(V(x) - E)}$. Meanwhile, if $\psi_2(x)$ is a solution to the
time-independent Schrödinger equation that tends to zero as $x$ approaches
$+\infty$, we expect that $\psi_2$ will be well approximated on $(b, +\infty)$, but away from
the turning point, by the expression

$$ \frac{c_2}{\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_b^x q(y)\, dy \right\}. \tag{15.20} $$

We refer to the functions in (15.19) and (15.20) as the exponential WKB
functions. The general theory of ordinary differential equations tells us that
any solution to the time-independent Schrödinger equation for a smooth
potential is smooth. Thus, the singularity at the turning points is an artifact
of our approximation method. Nevertheless, for small values of $\hbar$, the true
solution will “track” the WKB approximation until $x$ gets very close to
the turning point, with the result that the true solution will be large, but
finite, near the turning points.

Figure  15.3  plots  a  potential  function  V(x),  an  energy  level  E,  and  the 
WKB  functions  in  both  the  classically  allowed  and  classically  forbidden 
regions.  In  the  figure,  the  WKB  functions  have  been  (improperly)  used  all 
the  way  up  to  the  turning  points. 

15.5 The Airy Function and the Connection Formulas

For any constant $c_1$ and any energy level $E$, we expect that there is a unique
solution of the Schrödinger equation (15.6) that is well approximated
for $x$ tending to $-\infty$ by a function of the form (15.19). We expect that this
solution will be well approximated in the classically allowed region (but
not too close to the turning points) by a function of the form (15.18) for
a unique pair of constants $R$ and $\delta$. In this section, we will see that the
correct choices for $R$ and $\delta$ are

$$ R = 2c_1, \qquad \delta = \frac{\pi}{4}. \tag{15.21} $$

The formula (15.21) for $R$ and $\delta$ is called a connection formula; there is a
similar formula connecting an approximate solution that tends to zero as $x$
tends to $+\infty$ to an approximate solution in the classically allowed region.
By comparing the two connection formulas, we will obtain conditions on
the energy $E$ under which the two approximate solutions (one that decays
near $-\infty$ and one that decays near $+\infty$) agree up to a constant in the
classically allowed region. The condition on $E$ will turn out to be precisely
Condition 15.1.

The discussion in the previous paragraph should be compared to the
analysis in Chap. 5, where we determined the constants for the solution
inside the well in terms of the energy level and the constant in front of
the exponentially decaying solution outside the well. Here, of course, the
analysis is more complicated because neither of the approximations (15.19)
or (15.18) is valid near the turning point. The connection formula will be
obtained, then, by using the Airy equation to approximate the Schrödinger
equation near the turning points.

To get a reasonable approximation of our wave function near the turning
points, we approximate $V$ locally by a linear function. (By contrast, in the
WKB functions, we are essentially thinking of $V$ as being locally constant.)
Thus, for example, near the turning point $a$, we write $V(x) - E \approx (a - x)F_0$,
where $F_0 = -V'(a)$, yielding the approximate equation

$$ -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + (a - x)F_0\, \psi = 0. $$

By making the change of variable

$$ u = \left( \frac{2mF_0}{\hbar^2} \right)^{1/3} (a - x), \tag{15.22} $$

we can reduce the equation to

$$ \frac{d^2\psi}{du^2} - u\,\psi(u) = 0, \tag{15.23} $$

which is the Airy equation.
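
The reduction to the Airy equation can be verified in two lines. In the sketch below, $k$ is shorthand introduced here (not used in the text) for the scaling constant, so that $u = k(a - x)$ with $k = (2mF_0/\hbar^2)^{1/3}$.

```latex
% With k = (2mF_0/\hbar^2)^{1/3} and u = k(a - x):
\[
  \frac{d^2\psi}{dx^2} = k^2\,\frac{d^2\psi}{du^2},
  \qquad (a - x)F_0 = \frac{F_0}{k}\,u ,
\]
\[
  -\frac{\hbar^2}{2m}\,k^2\,\frac{d^2\psi}{du^2} + \frac{F_0}{k}\,u\,\psi = 0
  \;\Longleftrightarrow\;
  \frac{d^2\psi}{du^2} - \frac{2mF_0}{\hbar^2 k^3}\,u\,\psi = 0
  \;\Longleftrightarrow\;
  \frac{d^2\psi}{du^2} - u\,\psi = 0 ,
\]
% since k^3 = 2mF_0/\hbar^2.
```

Note that $k$ is proportional to $\hbar^{-2/3}$, so a fixed value of $u$ corresponds to a distance from the turning point of order $\hbar^{2/3}$.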

Equation (15.23) has two linearly independent solutions, denoted $\mathrm{Ai}(u)$
and $\mathrm{Bi}(u)$. We are interested in the solution $\mathrm{Ai}(u)$, since this is the one
that decays for $u > 0$, that is, for $x < a$. The function $\mathrm{Ai}(u)$ is defined by
the following convergent improper integral:

$$ \mathrm{Ai}(u) = \frac{1}{\pi} \int_0^\infty \cos\left( \frac{t^3}{3} + ut \right) dt. \tag{15.24} $$

Intuitively,  convergence  is  due  to  the  very  rapid  oscillation  of  the  integrand 
for  large  t,  which  produces  a  cancellation  between  the  positive  and  nega¬ 
tive  values  of  the  cosine  function.  Rigorously,  convergence  can  be  proved 
using  integration  by  parts,  as  in  Exercise  6.  By  differentiating  under  the 
integral  sign  (Exercise  7),  one  can  show  that  Ai  indeed  satisfies  the  Airy 
equation  (15.23). 

As $|u|$ gets large, the integrand in (15.24) becomes more and more rapidly
oscillating, producing more cancellation. The only exception to this behavior
is when the derivative (with respect to $t$) of the function $t^3/3 + ut$ is zero.
Near such a point, the argument of the cosine function is changing slowly
and there is little oscillation. If $u$ is negative, there is a unique critical point
of $t^3/3 + ut$ with $t > 0$, at $t = \sqrt{-u}$, and we expect that the main contribution
to the integral in (15.24) will come from $t \approx \sqrt{-u}$. If $u$ is positive, $t^3/3 + ut$
has no critical points, and we expect that the integral in (15.24) will become
quite small as $u$ tends to $+\infty$. This sort of reasoning can be used to determine
the precise asymptotics of the Airy function as $u$ tends to $+\infty$ and as $u$
tends to $-\infty$; see the discussion following (15.32) and (15.33).

We now state our main result, which will be derived in the remainder of
this section. The result is not rigorous, because we have not estimated any
of the errors involved; such error estimates will be performed in Sect. 15.6.


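Claim 15.7 If $\psi_1$ is a solution of the Schrödinger equation (15.6) that
tends to zero near $-\infty$, then $\psi_1$ can be normalized so that the following
approximations hold:

$$ \psi_1(x) \approx \frac{1}{2\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_x^a q(y)\, dy \right\} \quad (\text{near } -\infty) \tag{15.25} $$

$$ \psi_1(x) \approx \frac{\sqrt{\pi}}{(2mF_0\hbar)^{1/6}}\, \mathrm{Ai}\left( \left( \frac{2mF_0}{\hbar^2} \right)^{1/3} (a - x) \right) \quad (\text{near } x = a) \tag{15.26} $$

$$ \psi_1(x) \approx \frac{1}{\sqrt{p(x)}} \cos\left\{ \frac{1}{\hbar} \int_a^x p(y)\, dy - \frac{\pi}{4} \right\} \quad (a < x < b). \tag{15.27} $$

Here $F_0 = -V'(a)$ and in the case of (15.27), $x$ should not be too close to
$a$ or to $b$.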



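Similarly, if $\psi_2$ is a solution of the Schrödinger equation (15.6) that
tends to zero near $+\infty$, then $\psi_2$ can be normalized so that the following
approximations hold:

$$ \psi_2(x) \approx \frac{1}{\sqrt{p(x)}} \cos\left\{ -\frac{1}{\hbar} \int_x^b p(y)\, dy + \frac{\pi}{4} \right\} \quad (a < x < b) \tag{15.28} $$

$$ \psi_2(x) \approx \frac{\sqrt{\pi}}{(2mF_1\hbar)^{1/6}}\, \mathrm{Ai}\left( \left( \frac{2mF_1}{\hbar^2} \right)^{1/3} (x - b) \right) \quad (\text{near } x = b) \tag{15.29} $$

$$ \psi_2(x) \approx \frac{1}{2\sqrt{q(x)}} \exp\left\{ -\frac{1}{\hbar} \int_b^x q(y)\, dy \right\} \quad (\text{near } +\infty). \tag{15.30} $$

Here $F_1 = V'(b)$ and in the case of (15.28), $x$ should not be too close to $a$
or to $b$.

The approximate formulas for $\psi_1$ and $\psi_2$ will agree, up to multiplication
by a constant, in the classically allowed region if and only if we have

$$ \frac{1}{\hbar} \int_a^b p(x)\, dx = \left( n + \frac{1}{2} \right) \pi \tag{15.31} $$

for some non-negative integer $n$.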

More specifically, (15.27) and (15.28) are equal when the integer $n$ in
(15.31) is even and they are negatives of each other when $n$ is odd. Note
that there is a factor of 2 in the denominator in (15.25) but not in (15.27);
this factor accounts for the expression $R = 2c_1$ in (15.21).

Since the classical energy curve consists of two “branches,” of the form
$(x, p(x))$ and $(x, -p(x))$, the compatibility condition (15.31) is equivalent
to Condition 15.1. Since the phase of the approximate wave function in
the classically allowed region is given by $1/\hbar$ times the integral of $p\,dx$,
the condition (15.31) says that the wave function goes through a little
more than $n$ half-cycles between the two turning points, where a half-cycle
corresponds to a change in the phase in the amount of $\pi$, or the interval
between two critical points of the wave function. In particular, the wave
function has exactly $n + 1$ critical points inside the classically allowed region.
The first and last critical points occur slightly inside the turning points,
leaving a change in phase of roughly $\pi/4$ between the extreme critical point
and the turning point.
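
The compatibility condition, that $(1/\hbar)\int_a^b p\,dx$ equal $(n + 1/2)\pi$, can be solved numerically for the energy. The following is a minimal sketch under stated assumptions: the function names, the midpoint-rule quadrature, the bisection bracket, and the two test potentials are choices made here for illustration and do not come from the text. It assumes a single-well potential, so that the classically allowed region inside the search window is a single interval and the action increases with $E$.

```python
import math

def action(V, E, x_lo, x_hi, m=1.0, steps=10_000):
    """Midpoint-rule estimate of the integral of p(x) = sqrt(2m(E - V(x)))
    over the part of [x_lo, x_hi] where E > V(x)."""
    h = (x_hi - x_lo) / steps
    total = 0.0
    for i in range(steps):
        gap = E - V(x_lo + (i + 0.5) * h)
        if gap > 0.0:
            total += math.sqrt(2.0 * m * gap)
    return total * h

def wkb_energy(V, n, E_lo, E_hi, x_lo, x_hi, hbar=1.0, m=1.0, tol=1e-8):
    """Bisect in E until action(E)/hbar = (n + 1/2)*pi.
    Assumes the action is increasing in E on [E_lo, E_hi]."""
    target = (n + 0.5) * math.pi * hbar
    lo, hi = E_lo, E_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if action(V, mid, x_lo, x_hi, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: V(x) = x^2/2 with m = hbar = 1.
harmonic = lambda x: 0.5 * x * x
levels = [wkb_energy(harmonic, n, 0.0, 20.0, -10.0, 10.0) for n in range(3)]
print(levels)  # each entry should be close to n + 1/2
```

For the oscillator the action integral equals $\pi E$, so the computed levels should land near $n + 1/2$; for other potentials the output is only the leading-order semiclassical estimate of the eigenvalues.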

Figure 15.4 considers the same potential as in Fig. 15.3. The figure shows
the WKB functions (15.25) and (15.27), together with the scaled Airy function
(15.26), near the turning point $x = a$. Note that there is a good match
between the WKB functions and the scaled Airy function when $x$ is close
to, but not too close to, the turning point. Meanwhile, Fig. 15.5 then shows
the full approximate wave function with $\hbar$ chosen so that (15.31) holds
with $n = 39$, obtained by using the WKB functions away from the turning
points and the scaled Airy functions near the turning points. Finally,
Fig. 15.6 shows the probability distribution associated to the approximate
wave function, plotted together with the function $1/p(x)$. (Compare the
discussion preceding Conclusion 15.5.)

FIGURE 15.4. Plots of the scaled Airy function (thick curve) and the WKB
functions, near the turning point x = a.

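We now derive the results in Claim 15.7. The Airy function $\mathrm{Ai}(u)$ is
known to have the following asymptotic behavior:

$$ \mathrm{Ai}(u) \sim \frac{1}{2\sqrt{\pi}\, u^{1/4}} \exp\left\{ -\frac{2}{3} u^{3/2} \right\}, \quad u \to +\infty, \tag{15.32} $$

and

$$ \mathrm{Ai}(u) \sim \frac{1}{\sqrt{\pi}\, (-u)^{1/4}} \cos\left\{ \frac{2}{3} (-u)^{3/2} - \frac{\pi}{4} \right\}, \quad u \to -\infty. \tag{15.33} $$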

FIGURE 15.6. The probability distribution of the approximate wave function,
plotted against the function 1/p(x).

For $u$ tending to $-\infty$, the asymptotics in (15.33) can be obtained by a
straightforward application of the “method of stationary phase,” as explained
in Exercise 9. For $u$ tending to $+\infty$, repeated integrations by parts
(Exercise 8) show that $\mathrm{Ai}(u)$ decays faster than any power of $u$, which is all
that is strictly required for the main theorem of Sect. 15.6. To obtain the
precise asymptotics in (15.32), one should deform the contour of integration
to obtain a different integral representation of $\mathrm{Ai}(u)$, and then apply
some variant of the method of stationary phase, such as Laplace’s method
or the method of steepest descent. See Sect. 4.7 of [30] for one approach to
this analysis.
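
The quality of the two asymptotic regimes of the Airy function can be examined numerically. The sketch below is an illustration under assumptions made here: it evaluates $\mathrm{Ai}(u)$ from its standard Maclaurin series (a convenient alternative to the integral (15.24), adequate only for moderate $|u|$) and compares it with the standard leading-order asymptotic forms, a decaying exponential for $u \to +\infty$ and a cosine with phase shift $\pi/4$ for $u \to -\infty$. The function names are hypothetical.

```python
import math

def airy_ai(u, terms=100):
    """Ai(u) from its Maclaurin series; intended for moderate |u| only."""
    c1 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))  # Ai(0)
    c2 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))  # -Ai'(0)
    f_term, g_term = 1.0, u            # leading terms of the two series
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        # Recurrences follow from y'' = u*y applied to each power series.
        f_term *= u ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= u ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return c1 * f_sum - c2 * g_sum

def ai_decaying(u):
    """Leading-order form for large positive u."""
    return math.exp(-2.0 / 3.0 * u ** 1.5) / (2.0 * math.sqrt(math.pi) * u ** 0.25)

def ai_oscillating(u):
    """Leading-order form for large negative u."""
    s = -u
    return math.cos(2.0 / 3.0 * s ** 1.5 - math.pi / 4.0) / (math.sqrt(math.pi) * s ** 0.25)

for u in (2.0, 3.0, 4.0):
    print(u, airy_ai(u), ai_decaying(u))
for u in (-3.0, -5.0, -7.0):
    print(u, airy_ai(u), ai_oscillating(u))
```

Already at $|u|$ of order 3 to 5 the leading-order forms track the series values to within a few percent, which is why an interval around the turning point only needs to be large on the scale $\hbar^{2/3}$ for the matching argument to work.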

We will use the Airy function on an interval around the turning points
with a length that goes to zero as $\hbar$ tends to zero (so that the linear
approximation to the potential gets better and better) but with a length
that is large compared to $\hbar^{2/3}$ (so that the value of $u$ at the ends of the
interval will be large, putting us into the asymptotic region of the Airy
function). See Sect. 15.6 for more information.

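We use the linear approximation $V(x) - E \approx (a - x)F_0$ to the potential near $x = a$, where $F_0 = -V'(a)$, which turns the Schrödinger equation (15.6) into the Airy equation, as previously noted. Now, the linear approximation to $V$ yields

$$p \approx \sqrt{2mF_0}\,\sqrt{x - a} \tag{15.34}$$

and

$$\frac{1}{\hbar}\int_a^x p(y)\,dy \approx \frac{\sqrt{2mF_0}}{\hbar}\,\frac{(x - a)^{3/2}}{3/2} = \frac{2}{3}(-u)^{3/2}. \tag{15.35}$$

From here it is a simple matter to check, using (15.33), that

$$\sqrt{\pi}\,(2mF_0\hbar)^{-1/6}\,\mathrm{Ai}(u) \approx \frac{1}{\sqrt{p(x)}}\cos\left(\frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4}\right)$$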

for $x > a$, where the approximation holds in an intermediate region where $x$ is close to $a$ but not too close to $a$. Thus, if we scale our solution $\psi_1$ to the Schrödinger equation so that it is approximated by $\pi^{1/2}(2mF_0\hbar)^{-1/6}$ times $\mathrm{Ai}(u)$ near $x = a$, it should satisfy (15.27) in the classically allowed region (but away from the turning points). It is then straightforward to verify, using (15.32), that this multiple of $\mathrm{Ai}(u)$ satisfies (15.25) for $x$ near $-\infty$. The analysis of $\psi_2$ is entirely similar.

Finally, to compare the approximations (15.27) and (15.28), we note that

$$-\frac{1}{\hbar}\int_x^b p(y)\,dy + \frac{\pi}{4} = \left(\frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4}\right) - \phi,$$

where

$$\phi = \frac{1}{\hbar}\int_a^b p(y)\,dy - \frac{\pi}{2}.$$

Now, if $\phi$ is an odd multiple of $\pi$, then $\cos(\theta - \phi) = -\cos\theta$ and if $\phi$ is an even multiple of $\pi$, then $\cos(\theta - \phi) = \cos\theta$. For all other values of $\phi$ (Exercise 4), $\cos(\theta - \phi)$ is not a constant multiple of $\cos\theta$. Thus, (15.31) is a necessary and sufficient condition for the two approximate solutions to agree up to a constant in the classically allowed region.
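To see the matching condition at work, the following sketch evaluates the phase $\phi$ numerically for a harmonic oscillator potential $V(y) = m\omega^2y^2/2$. The choice of potential, the units $m = \omega = \hbar = 1$, and the function names are assumptions made for illustration only. The substitution $y = A\sin t$ removes the square-root singularities of $p$ at the turning points, so a simple midpoint rule suffices.

```python
import math


def action_integral(energy, mass=1.0, omega=1.0, steps=2000):
    """Midpoint-rule value of the integral of p(y) between the turning points
    for V(y) = mass * omega**2 * y**2 / 2, using y = amplitude * sin(t)."""
    amplitude = math.sqrt(2.0 * energy / (mass * omega ** 2))
    dt = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -math.pi / 2.0 + (i + 0.5) * dt
        y = amplitude * math.sin(t)
        kinetic = energy - 0.5 * mass * omega ** 2 * y * y
        p = math.sqrt(max(0.0, 2.0 * mass * kinetic))
        total += p * amplitude * math.cos(t) * dt
    return total


def phase_mismatch(energy, hbar=1.0, mass=1.0, omega=1.0):
    """phi = (1/hbar) * (integral of p between the turning points) - pi/2."""
    return action_integral(energy, mass, omega) / hbar - math.pi / 2.0


if __name__ == "__main__":
    hbar = omega = 1.0
    for n in range(4):
        energy = (n + 0.5) * hbar * omega
        print(n, phase_mismatch(energy, hbar) / math.pi)   # close to the integer n
    print("off-level:", phase_mismatch(0.8) / math.pi)      # not an integer
```

For energies of the form $(n + 1/2)\hbar\omega$ the printed ratio $\phi/\pi$ is an integer to within rounding, so the two cosine expressions agree up to sign. For a generic energy such as $0.8$ it is not, and the two approximations cannot be matched by a constant.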


15.6  A  Rigorous  Error  Estimate 

The preceding sections give a treatment of the WKB approximation that is typical of many books in the literature. This treatment gives the idea that energies $E$ satisfying the corrected Bohr-Sommerfeld Condition (Condition 15.1) should be approximate eigenvalues for the Hamiltonian operator $\hat{H}$, without specifying the sense in which this approximation holds. In this section, we prove a rigorous estimate, as follows.

Theorem 15.8 For any potential $V$ and range $[E_1, E_2]$ of energies satisfying Assumption 15.3, there is a constant $C$ such that the following holds. For any energy $E \in [E_1, E_2]$ satisfying Condition 15.1, there exists a nonzero function $\psi$ belonging to $\mathrm{Dom}(\hat{H})$ such that

$$\|\hat{H}\psi - E\psi\| \leq C\hbar^{9/8}\|\psi\|. \tag{15.36}$$


As noted already in Sect. 15.3, an estimate of the form $\|\hat{H}\psi - E\psi\| < \varepsilon\|\psi\|$ implies that there is a point $\tilde{E}$ in the spectrum of $\hat{H}$ with $|\tilde{E} - E| < \varepsilon$. (See Exercise 4 in Chap. 10.) Since, under our assumptions on $V$, the spectrum of $\hat{H}$ is purely discrete, we conclude that for each number $E \in [E_1, E_2]$ satisfying Condition 15.1, there is an actual eigenvalue $\tilde{E}$ for $\hat{H}$ with

$$|E - \tilde{E}| \leq C\hbar^{9/8}. \tag{15.37}$$


If $E$ satisfies Condition 15.1, then the estimate (15.37) actually holds with $\hbar^{9/8}$ replaced by $\hbar^2$ on the right-hand side. It is not, however, possible to obtain such an optimal estimate by the methods we are using in this chapter. Specifically, the approximate eigenvector $\psi$ constructed in the proof of Theorem 15.8 does not satisfy an estimate of the form $\|\hat{H}\psi - E\psi\| \leq C\hbar^2$. One can, however, construct an approximate eigenvector by different methods (for example, the method in [31]) that satisfies an order-$\hbar^2$ error estimate, for any $E$ satisfying the corrected Condition 15.1. Nevertheless, the error bound in (15.37) is small compared to the typical spacing between the energy levels, which is of order $\hbar$.
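The ratio of the error bound to the level spacing is of order $\hbar^{1/8}$, which tends to zero rather slowly. The following sketch tabulates this ratio, taking both the constant $C$ and the constant in the level spacing to be 1. These values are placeholders chosen for illustration, not quantities computed in the text.

```python
def relative_error_bound(hbar, constant=1.0):
    """Ratio of the bound constant * hbar**(9/8) to a level spacing taken to be hbar."""
    return constant * hbar ** (9.0 / 8.0) / hbar


if __name__ == "__main__":
    for hbar in (1e-2, 1e-4, 1e-8, 1e-16):
        print(f"hbar = {hbar:g}: bound / spacing ~ {relative_error_bound(hbar):.3f}")
```

With these placeholder constants, $\hbar$ must be as small as $10^{-8}$ before the bound drops to a tenth of the spacing, which gives some feeling for why the order-$\hbar^2$ estimate mentioned above is a substantial improvement.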

Recall, as we noted at the beginning of Sect. 15.4, that a Schrödinger operator with potential $V$ that is smooth and tends to $+\infty$ at $\pm\infty$ is essentially self-adjoint on $C_c^\infty(\mathbb{R})$. The operator $\hat{H}$ in Theorem 15.8 is, more precisely, the unique self-adjoint extension of the Schrödinger operator defined on $C_c^\infty(\mathbb{R})$.


15.6.1  Preliminaries 


Our construction of the approximate eigenfunction $\psi$ will be essentially by the WKB approximation as outlined in Claim 15.7. That is to say, we will define $\psi$ using scaled Airy functions near the turning points and by the standard WKB functions in the classically allowed and classically forbidden regions. There is, however, a difficulty with this approach, which is that at the boundary between different regions, the scaled Airy function does not exactly match the WKB functions, but only approximately. What this means is that if we define $\psi$ by the WKB formula in, say, an interval of the form $(-\infty, a - \varepsilon)$ and we define $\psi$ by a scaled Airy function on $(a - \varepsilon, a + \varepsilon)$, then $\psi$ may be discontinuous at $a - \varepsilon$. Even if we scale $\psi$ by a constant on one of these intervals to eliminate the discontinuity in $\psi$ itself, the derivative of $\psi$ will still probably be discontinuous. But if the derivative of $\psi$ is discontinuous, $\psi$ is not actually in the domain of $\hat{H}$, and the left-hand side of (15.36) does not make sense. (Compare Sect. 5.2.)

The condition that $\psi'$ be continuous is not just a technicality: If we did not worry about continuity of $\psi'$, then we could always match the scaled Airy function to the WKB functions, just by multiplying the various functions by constants, regardless of whether or not the energy satisfies the corrected Bohr-Sommerfeld Condition. In that case, we would be claiming that any number $E \in [E_1, E_2]$ is within $C\hbar^{9/8}$ of an eigenvalue of $\hat{H}$, which is false already for the harmonic oscillator.

To work around the difficulty described in the previous paragraphs, we must put in a transition region over which we smoothly pass from one function to the other, using the “join” construction described in Sect. 15.6.4. Thus, we define the function $\psi$ in Theorem 15.8 as follows. We use the formulas in Claim 15.7 in the indicated intervals, except that we multiply the functions (15.28), (15.29), and (15.30) by $-1$ when $n$ is odd. We use the scaled Airy functions (15.26) and (15.29) on intervals of the form $(a - \varepsilon, a + \varepsilon)$ and $(b - \varepsilon, b + \varepsilon)$, respectively, for some $\varepsilon$ depending on $\hbar$ in a manner to be determined later. We then put in four transition regions, each having length $\delta$, where $\delta$ also depends on $\hbar$ in a manner to be determined later. The first transition region, for example, is the interval $(a - \varepsilon - \delta, a - \varepsilon)$ between the first classically forbidden region and the first turning point. In each transition region, we change over smoothly from one function to another. See Fig. 15.7 for an illustration of the transition regions around the turning point $x = a$.

FIGURE 15.7. The approximate eigenfunction $\psi$, with the transition regions shaded.

Suppose $\hat{H}_0$ denotes the Schrödinger operator with potential $V$, with domain equal to $C_c^\infty(\mathbb{R})$. Then, as we have noted, $\hat{H}_0$ is essentially self-adjoint, and we are letting $\hat{H}$, which coincides with the adjoint operator $\hat{H}_0^*$, denote the unique self-adjoint extension of $\hat{H}_0$. Now, the domain of $\hat{H}_0^*$ consists of all functions $\psi \in L^2(\mathbb{R})$ such that the result of applying the Schrödinger operator to $\psi$, computed in the distributional sense, again belongs to $L^2(\mathbb{R})$. In particular, if $\psi$ is smooth, then $\psi$ belongs to the domain of $\hat{H} = \hat{H}_0^*$ if and only if $\psi$ is in $L^2(\mathbb{R})$ and $-(\hbar^2/2m)\psi'' + V\psi$ is also in $L^2(\mathbb{R})$.

Because of the joins, our approximate eigenfunction $\psi$ is actually infinitely differentiable on all of $\mathbb{R}$. And since $V(x)$ tends to $+\infty$ at $\pm\infty$, the exponential WKB functions (15.25) and (15.30) have rapid decay at infinity, which shows that $\psi$ is in $L^2(\mathbb{R})$. Furthermore, for $x$ near $\pm\infty$, the calculation (15.17) applies, with $A(x) = Cq(x)^{-1/2}$. We obtain, after a short calculation,

$$-\frac{\hbar^2}{2m}\psi''(x) + V(x)\psi(x) = E\psi(x) - \frac{\hbar^2}{2m}\left[\frac{5}{16}\left(\frac{V'(x)}{V(x) - E}\right)^2 - \frac{1}{4}\frac{V''(x)}{V(x) - E}\right]\psi(x). \tag{15.38}$$

Since $V'/V$ and $V''/V$ are assumed to be bounded near infinity and $V(x)$ tends to $+\infty$ at $\pm\infty$, we see that the Schrödinger operator applied to $\psi$ is bounded by a constant times $\psi$ near infinity and is thus square integrable. This shows that $\psi$ is in the domain of $\hat{H}$.

In Sect. 15.6.2, we will take the width $2\varepsilon$ of the region around the turning points to be of order $\hbar^{1/2}$. In that case, the $L^2$ norm of our approximate wave function is of order 1 (bounded and bounded away from zero) as $\hbar$ tends to zero, despite the blow-up of order $\hbar^{-1/6}$ very near the turning points. This result is not hard to verify (Exercise 10). In any case, the only concern raised by the blow-up is that the norm might grow as $\hbar$ tends to zero, and that would only help us in showing that $\|\hat{H}\psi - E\psi\|$ is small compared to $\|\psi\|$.

To prove Theorem 15.8, we must estimate the contributions to the quantity $\|\hat{H}\psi - E\psi\|$ from four different types of regions: the classically allowed region, the classically forbidden regions, the regions near the turning points, and the transition regions. These estimates will occupy the remainder of this section, with the analysis in the transition regions being the most involved. In particular, it is essential that the derivative of the scaled Airy function almost match the derivative of the WKB function in the transition region, as in the second part of Lemma 15.9.


15.6.2  The  Regions  Near  the  Turning  Points 

We use a scaled Airy function in an interval around each turning point. [We use (15.26) near $x = a$ and either (15.29) or the negative thereof near $x = b$, depending on whether $n$ is even or odd.] We now verify that taking these intervals to have length of order $\hbar^{1/2}$ will give satisfactory estimates. If $\psi$ denotes one of the scaled Airy functions, then $\psi$ satisfies a Schrödinger equation in which the potential $V$ is replaced by a linear approximation $\tilde{V}$ near one of the turning points, which means that

$$\hat{H}\psi - E\psi = (V(x) - \tilde{V}(x))\psi. \tag{15.39}$$

The difference between $V(x)$ and its linear approximation $\tilde{V}(x)$ grows at most quadratically with the distance from the turning point. Meanwhile, the asymptotics of the Airy function tell us that it can be bounded as $|\mathrm{Ai}(u)| \leq C|u|^{-1/4}$. (This is a terrible estimate for small $u$, but still true.) Now $u$, as defined in (15.22), is of order $\hbar^{-2/3}$ times the distance to the turning point. Since, also, there is a factor of $\hbar^{-1/6}$ in (15.26) and the distance from the turning point is at most of order $\hbar^{1/2}$, we find that

$$|\hat{H}\psi - E\psi| \leq C(\hbar^{1/2})^2\,\hbar^{-1/6}\,(\hbar^{-2/3}\hbar^{1/2})^{-1/4} = C\hbar^{7/8}$$

over the interval around each turning point. Finally, if a function $f$ satisfies $|f| \leq D$ on an interval of length $L$, then the $L^2$ norm of $f$ over that interval will be at most $D\sqrt{L}$. Thus, over the interval around the turning points,

$$\|\hat{H}\psi - E\psi\| = O(\hbar^{7/8}\hbar^{1/4}) = O(\hbar^{9/8}).$$
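The bookkeeping of powers of $\hbar$ in this estimate is easy to get wrong by hand, so it may help to track the exponents with exact rational arithmetic. The sketch below records each factor as a power of $\hbar$ and adds the exponents. The variable names are ours and are purely illustrative.

```python
from fractions import Fraction as F

# Each factor in the pointwise bound is recorded as an exponent of hbar.
distance = F(1, 2)                  # distance to the turning point ~ hbar^(1/2)
potential_error = 2 * distance      # |V - linear approximation| ~ distance^2
airy_prefactor = F(-1, 6)           # factor hbar^(-1/6) in the scaled Airy function
u_exponent = F(-2, 3) + distance    # u ~ hbar^(-2/3) * distance
airy_decay = F(-1, 4) * u_exponent  # |Ai(u)| <= C |u|^(-1/4)

pointwise = potential_error + airy_prefactor + airy_decay
l2_norm = pointwise + distance / 2  # multiply by the square root of the interval length

print(pointwise, l2_norm)           # exponents of the pointwise and L^2 bounds
```

Changing `distance` to another exponent shows how the final power depends on the choice of interval length. Shorter intervals improve this estimate but worsen the one in the classically allowed region, which is why $\hbar^{1/2}$ is used.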


15.6.3 The Classically Allowed and Classically Forbidden Regions

The expression (15.38) for $\hat{H}\psi - E\psi$, derived from (15.17), applies both in the classically allowed region and in the classically forbidden regions. Let us consider first the classically allowed region. Although (15.38) is nominally of order $\hbar^2$, we use this expression on an interval whose ends get closer and closer to the turning point as $\hbar$ tends to zero. Since, also, the expression in (15.38) is blowing up at the turning points, the contribution to $\|\hat{H}\psi - E\psi\|$ from this interval is of order larger than $\hbar^2$.

We have taken the interval around the turning point to have length $2\varepsilon$ that is of order $\hbar^{1/2}$, and we will also take (Sect. 15.6.4) the transition regions to have length $\delta$ that is of order $\hbar^{1/2}$. Thus, we use the oscillatory WKB function on an interval of the form $(a + \gamma, b - \gamma)$, where $\gamma = \varepsilon + \delta$ is of order $\hbar^{1/2}$. Now, the formula for $\psi$ in the classically allowed region has a factor of $1/\sqrt{p(x)}$ times a bounded quantity (the cosine factor). Since $V'(a)$ is assumed to be nonzero, $V(x) - E$ behaves like a constant times $(x - a)$ and so $1/\sqrt{p(x)}$ behaves like a constant times $(x - a)^{-1/4}$ for $x$ approaching $a$, with similar behavior near the other turning point.

Meanwhile, the more problematic term in (15.38) is the term having $(V(x) - E)^2$ in the denominator. Keeping in mind the $1/\sqrt{p}$ blowup of $\psi$ itself, this term behaves like $(x - a)^{-9/4}$ as $x$ approaches $a$. Thus, we may estimate the norm of $\hat{H}\psi - E\psi$ over the left half of the classically allowed region as

$$\|\hat{H}\psi - E\psi\| \leq C\hbar^2\left(\int_{a+\gamma}^{(a+b)/2}(x - a)^{-9/2}\,dx\right)^{1/2} = C'\hbar^2\left(\gamma^{-7/2} - ((b - a)/2)^{-7/2}\right)^{1/2}.$$

Since $\gamma$ is of order $\hbar^{1/2}$, the contribution to $\|\hat{H}\psi - E\psi\|$ from the interval $(a + \gamma, (a + b)/2)$ will consist of a term of order $\hbar^2\hbar^{-7/8} = \hbar^{9/8}$, plus lower-order terms. The estimate over the other half of the classically allowed region is similar.
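The scaling of this integral with $\gamma$ can be checked directly. The following sketch evaluates the square root of the integral of $s^{-9/2}$ from $\gamma$ to a fixed upper limit, where $s = x - a$, and estimates the slope on a log-log scale. The upper limit `half_width = 1.0` stands in for $(b - a)/2$ and is an arbitrary illustrative value.

```python
import math


def blowup_norm(gamma, half_width=1.0):
    """Square root of the integral of s**(-9/2) for s from gamma to half_width,
    where s = x - a is the distance from the left turning point."""
    integral = (2.0 / 7.0) * (gamma ** -3.5 - half_width ** -3.5)
    return math.sqrt(integral)


def log_slope(gamma_small, gamma_large):
    """Slope of log(blowup_norm) against log(gamma) between two values of gamma."""
    rise = math.log(blowup_norm(gamma_small)) - math.log(blowup_norm(gamma_large))
    return rise / (math.log(gamma_small) - math.log(gamma_large))


if __name__ == "__main__":
    print(log_slope(1e-4, 1e-3))   # close to -7/4
    hbar = 1e-6
    print(hbar ** 2 * blowup_norm(math.sqrt(hbar)), hbar ** (9.0 / 8.0))
```

The slope of $-7/4$ in $\gamma$ becomes $-7/8$ in $\hbar$ once $\gamma$ is taken of order $\hbar^{1/2}$, and the last line confirms that $\hbar^2$ times the norm is a fixed multiple of $\hbar^{9/8}$.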

Meanwhile, in the first classically forbidden region, we also apply (15.38). By Assumption 15.3, $V'/V$ and $V''/V$ are bounded near infinity. Thus, $V'/(V - E)$ and $V''/(V - E)$ will also be bounded near infinity, and thus also bounded on $(-\infty, a - 1)$, since $V - E$ is strictly positive on this interval and tends to $+\infty$ as $x$ tends to $-\infty$. We see, then, that the norm of $\hat{H}\psi - E\psi$ over $(-\infty, a - 1)$ is bounded by a constant times $\hbar^2\|\psi\|$.

The norm of $\hat{H}\psi - E\psi$ over an interval of the form $(a - 1, a - \gamma)$ can be analyzed similarly to the classically allowed region. The estimates from this region are better, however, because of the exponentially decaying factor in the definition of the WKB function. Thus, the contribution to $\|\hat{H}\psi - E\psi\|$ from the classically forbidden region $(-\infty, a - \gamma)$ is certainly no larger than order $\hbar^{9/8}$, and similarly for the other classically forbidden region.

FIGURE 15.8. The join of two functions over the interval $[a, a + \delta]$ (thick curve).


15.6.4  The  Transition  Regions 

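Given two smooth functions $\psi_1$ and $\psi_2$ and some interval of the form $[a, a + \delta]$, we now define a “join” $\psi_1 \sqcup \psi_2$ of $\psi_1$ and $\psi_2$, where $(\psi_1 \sqcup \psi_2)(x)$ is equal to $\psi_1(x)$ for $x \leq a$ and equal to $\psi_2(x)$ for $x \geq a + \delta$, and where $\psi_1 \sqcup \psi_2$ is smooth everywhere. Let $\chi$ be a smooth function on $[0, 1]$ that is identically equal to 0 in a neighborhood of 0 and identically equal to 1 in a neighborhood of 1. Then define $\psi_1 \sqcup \psi_2$ by

$$(\psi_1 \sqcup \psi_2)(x) = \psi_1(x) + (\psi_2(x) - \psi_1(x))\,\chi((x - a)/\delta).$$

(See Fig. 15.8.) By direct calculation, we have

$$\begin{aligned}
(\hat{H} - EI)(\psi_1 \sqcup \psi_2) ={}& (\hat{H}\psi_1 - E\psi_1) \sqcup (\hat{H}\psi_2 - E\psi_2) \\
& - \frac{\hbar^2}{\delta m}(\psi_2'(x) - \psi_1'(x))\,\chi'((x - a)/\delta) \\
& - \frac{\hbar^2}{2m\delta^2}(\psi_2(x) - \psi_1(x))\,\chi''((x - a)/\delta).
\end{aligned} \tag{15.40}$$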
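The join construction is straightforward to realize concretely. The sketch below builds one possible cutoff $\chi$ from the standard smooth function $e^{-1/t}$ and then forms the join of two given functions. The particular cutoff, which is identically 0 on $[0, 1/4]$ and identically 1 on $[3/4, 1]$, and the function names are our own illustrative choices. The text requires only that $\chi$ be smooth with the stated behavior near 0 and 1.

```python
import math


def smooth_ramp(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def chi(s):
    """One possible cutoff: identically 0 for s <= 1/4 and identically 1 for s >= 3/4."""
    r = 2.0 * s - 0.5                   # rescale [1/4, 3/4] to [0, 1]
    top = smooth_ramp(r)
    return top / (top + smooth_ramp(1.0 - r))


def join(psi1, psi2, a, delta):
    """Return the function psi1 + (psi2 - psi1) * chi((x - a) / delta)."""
    def joined(x):
        return psi1(x) + (psi2(x) - psi1(x)) * chi((x - a) / delta)
    return joined


if __name__ == "__main__":
    joined = join(math.sin, math.cos, a=0.0, delta=1.0)
    for x in (-0.5, 0.1, 0.5, 0.9, 1.5):
        print(x, joined(x))
```

Since the derivatives of $\chi((x - a)/\delta)$ carry factors of $1/\delta$ and $1/\delta^2$, a narrow transition region makes the last two lines of (15.40) large unless $\psi_2 - \psi_1$ and $\psi_2' - \psi_1'$ are correspondingly small there.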

In constructing our approximate eigenfunction, we use five different formulas in five different regions: the two classically forbidden regions, the classically allowed region, and the regions near the two turning points. Since none of these functions exactly matches the function in the next interval, we put in a total of four joins in order to produce a function that is in the domain of $\hat{H}$. We choose the width $\delta$ of the interval on which the join takes place to be of the same size as the intervals around the turning points, namely, order $\hbar^{1/2}$.

The most critical case is the transition from the region near the turning points to the classically allowed region. Consider, for example, the scaled Airy function $\psi_1$ in (15.26) and the oscillatory WKB function $\psi_2$ in (15.27). There are two contributions to the mismatch between these two functions. First, there is a discrepancy between the Airy function and its leading-order asymptotics. Second, there is an error in the approximations (15.34) and (15.35), which come from the discrepancy between the potential $V(x)$ and its linear approximation $\tilde{V}(x)$ near $x = a$. We need to consider both contributions to the mismatch in our estimation of $\psi_1 - \psi_2$ and of $\psi_1' - \psi_2'$.

Lemma 15.9 Let $\psi_1$ denote the scaled Airy function in (15.26), let $\tilde{\psi}_1$ denote the same function with the Airy function replaced by the right-hand side of (15.33), and let $\psi_2$ denote the oscillatory WKB function in (15.27). If $x - a$ is positive and of order $\hbar^{1/2}$, we have

$$|\psi_1(x) - \tilde{\psi}_1(x)| = O(\hbar^{1/8}), \qquad |\tilde{\psi}_1(x) - \psi_2(x)| = O(\hbar^{1/8})$$

and

$$|\psi_1'(x) - \tilde{\psi}_1'(x)| = O(\hbar^{-5/8}), \qquad |\tilde{\psi}_1'(x) - \psi_2'(x)| = O(\hbar^{-5/8}).$$


Before giving the proof of this lemma, let us verify that these estimates are sufficient to control the contribution to $\|\hat{H}\psi - E\psi\|$ from the transition region $(a + \varepsilon, a + \varepsilon + \delta)$ between the first turning point and the classically allowed region, where both $\varepsilon$ and $\delta$ are taken to be of order $\hbar^{1/2}$. We must consider each of the three lines in (15.40). The $L^2$ norm of the first line is of order at most $\hbar^{9/8}$, by precisely the same argument as in Sect. 15.6.3.

For the second and third lines, we recall that if a function $f$ is bounded by $C$, then the $L^2$ norm of $f$ over an interval of length $L$ is at most $C\sqrt{L}$. Since we are taking the length $\delta$ of our transition interval to be of order $\hbar^{1/2}$, the $L^2$ norm of the second line of (15.40) is of order

$$\frac{\hbar^2}{\hbar^{1/2}}\,\hbar^{-5/8}\,\hbar^{1/4} = \hbar^{9/8}.$$

Meanwhile, the contribution from the third line of (15.40) is of order

$$\frac{\hbar^2}{\hbar}\,\hbar^{1/8}\,\hbar^{1/4} = \hbar^{11/8}.$$

Thus, the contribution to $\|\hat{H}\psi - E\psi\|$ from the transition region $(a + \varepsilon, a + \varepsilon + \delta)$ is of order at most $\hbar^{9/8}$.

The analysis of the transition between the classically allowed region and the region around $x = b$ is entirely similar. The analysis of the transitions between the regions near the turning points and the classically forbidden regions is also similar, but much less delicate, because all of the functions involved are very small in the transition region. When $(a - x)$ is positive and of order $\hbar^{1/2}$, for example, $u$, as defined in (15.22), will be of order $\hbar^{-1/6}$ and so $u^{3/2}$ is of order $\hbar^{-1/4}$. Thus, the exponential factor in the leading-order asymptotics of the Airy function for $u > 0$ will behave like $\exp(-C\hbar^{-1/4})$, which is very small for small $\hbar$, certainly smaller than any power of $\hbar$. Since all the factors in front of the exponential will behave like $\hbar$ to a power, the overall contribution to $\|\hat{H}\psi - E\psi\|$ from the transition between the region near the turning points and the classically forbidden region is smaller than any power of $\hbar$. Thus, none of the transition regions contributes an error worse than $O(\hbar^{9/8})$.
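The statement that $\exp(-C\hbar^{-1/4})$ is smaller than any power of $\hbar$ is an asymptotic one, and it is instructive to see how small $\hbar$ must be before it takes effect. The sketch below compares the exponential with $\hbar^N$ on a logarithmic scale, which avoids floating-point underflow. The values $C = 1$ and $N = 2$ are arbitrary illustrative choices.

```python
import math


def log_ratio(hbar, power, c=1.0):
    """Natural log of exp(-c * hbar**(-1/4)) / hbar**power, computed without underflow."""
    return -c * hbar ** -0.25 - power * math.log(hbar)


if __name__ == "__main__":
    for hbar in (1e-4, 1e-8, 1e-12, 1e-16):
        print(f"hbar = {hbar:g}: log ratio for power 2 = {log_ratio(hbar, 2):.2f}")
```

A positive value means the exponential is still larger than $\hbar^N$, and a negative value means it is smaller. With these choices the exponential overtakes $\hbar^2$ only somewhere between $\hbar = 10^{-4}$ and $\hbar = 10^{-8}$, after which the logarithm of the ratio decreases without bound.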

Proof of Lemma 15.9. We consider only the estimates for the derivatives of the functions involved. The analysis of the functions themselves is similar (but easier) and is left as an exercise to the reader (Exercise 11).

We begin by considering $\psi_1' - \tilde{\psi}_1'$. With a little algebra, we compute that

$$\frac{d\psi_1}{dx} - \frac{d\tilde{\psi}_1}{dx} = -\sqrt{\pi}\,(2mF_0)^{1/6}\hbar^{-5/6}\left(\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u)\right) \tag{15.41}$$

where $u$ is as in (15.22) and where $\widetilde{\mathrm{Ai}}$ is the function on the right-hand side of (15.33).

Now, $\mathrm{Ai}(u)$ has an asymptotic expansion for $u \to -\infty$ given by

$$\mathrm{Ai}(u) = \widetilde{\mathrm{Ai}}(u)\left(1 + C(-u)^{-3/2} + \cdots\right),$$

and $\mathrm{Ai}'(u)$ has the asymptotic expansion obtained by formally differentiating this with respect to $u$. [See Eq. (7.64) in [30].] From this, we obtain

$$\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u) = \widetilde{\mathrm{Ai}}'(u)\,O((-u)^{-3/2}) + \widetilde{\mathrm{Ai}}(u)\,O((-u)^{-5/2}). \tag{15.42}$$

From the explicit formula for $\widetilde{\mathrm{Ai}}$, we see that $\widetilde{\mathrm{Ai}}(u)$ is of order $(-u)^{-1/4}$. Meanwhile, the formula for $\widetilde{\mathrm{Ai}}'(u)$ will contain two terms, the larger of which will be of order $(-u)^{1/4}$. Thus, the slower-decaying term on the right-hand side of (15.42) is the first one, which is of order $(-u)^{-5/4}$. Now, in the transition regions, $-u$ behaves like $\hbar^{-2/3}\hbar^{1/2} = \hbar^{-1/6}$. Thus, (15.42) goes like $\hbar^{5/24}$ and so (15.41) goes like $\hbar^{-5/6+5/24} = \hbar^{-5/8}$, as claimed.

We now consider $\tilde{\psi}_1' - \psi_2'$. By direct calculation, the derivatives of $\tilde{\psi}_1$ and $\psi_2$ each consist of two terms, a “dominant” term obtained by differentiating the cosine factor and a “subdominant” term obtained by differentiating the coefficient of the cosine factor. In the case of $\tilde{\psi}_1$, the dominant term in the derivative may be simplified to

$$-\frac{1}{\hbar}\left((2mF_0)(x - a)\right)^{1/4}\sin\left(\frac{2}{3}(-u)^{3/2} - \frac{\pi}{4}\right). \tag{15.43}$$

According to Exercise 12, we have, when $x - a$ is of order $\hbar^{1/2}$, the estimates

$$\left((2mF_0)(x - a)\right)^{1/4} = \sqrt{p} + \sqrt{p}\,O(\hbar^{1/2}) \tag{15.44}$$

and

$$\frac{2}{3}(-u)^{3/2} = \frac{1}{\hbar}\int_a^x p(y)\,dy + O(\hbar^{1/4}). \tag{15.45}$$

Since the derivative of $\sin\theta$ is bounded, a change of order $\hbar^{1/4}$ in the argument of a sine function produces a change of order $\hbar^{1/4}$ in the value of the sine. Thus, if we substitute (15.44) and (15.45) into (15.43), we find that the difference between the dominant term in $\tilde{\psi}_1'$ and the dominant term in $\psi_2'$ is

$$\frac{1}{\hbar}\sqrt{p}\,O(\hbar^{1/4}) + \text{lower-order terms}.$$

Since $\sqrt{p}$ is of order $(x - a)^{1/4}$ or $\hbar^{1/8}$, we get an error of order $\hbar^{-5/8}$, as claimed.

Finally, the subdominant terms in the derivatives of $\tilde{\psi}_1$ and $\psi_2$ are easily seen to be separately of order $\hbar^{-5/8}$. Thus, even without taking into account the cancellation between these terms, they do not change the order of the estimate. ■


15.6.5 Proof of the Main Theorem

We have estimated the contributions to $\|\hat{H}\psi - E\psi\|$ from each type of region: classically allowed and classically forbidden regions, the regions around the turning points, and the transition regions. In each case, we have found a contribution that is of order at most $\hbar^{9/8}\|\psi\|$. Thus, it remains only to verify that the constants in all estimates are bounded uniformly over the given range $E_1 \leq E \leq E_2$ of energies.

This verification is straightforward. Near the turning point $x = a$, for example, we need to estimate the difference between the potential $V(x)$ and its linear approximation $\tilde{V}(x)$ near $x = a$. As a consequence of the Taylor remainder formula, $|V(x) - \tilde{V}(x)|$ will be bounded by $C|x - a|^2/2$, where $C$ is the maximum of $|V''(y)|$ for $y$ in the interval from $a$ to $x$. As $E$ varies over $[E_1, E_2]$, the set of points where we have to evaluate $|V''|$ will be bounded, meaning that $C$ can be taken to be independent of $E$, for $E$ in such a range.

Similarly, in the classically allowed region, the blow-up of $1/(V(x) - E)^2$ near $x = a(E)$ can be controlled by the minimum of $|V'(y)|$ for $y$ between $a$ and $x$. By assumption, $|V'(x)| > 0$ at all the turning points $a(E)$ and $b(E)$ with $E_1 \leq E \leq E_2$, and thus, by continuity, in some neighborhood of that set of turning points. Thus, the blow-up of $1/(V(x) - E)^2$ will be controlled by the minimum of $|V'(x)|$ on an interval of the form $[a(E_2) - \alpha, a(E_1) + \alpha]$ for some small $\alpha > 0$. The remaining details of this verification are left to the reader.


15.7  Other  Approaches 

The main complicating factor in the WKB approximation is the singular behavior near the turning points. The turning points, meanwhile, are only problematic because we are working in the position representation. The turning points, after all, are the points on the classical trajectory where the position of the particle achieves a maximum or a minimum. If we were to work in the momentum representation, the points where the momentum achieves a maximum or a minimum would instead be the problematic points. A. Voros [42] has proposed working in the Segal-Bargmann representation (Sect. 14.4). In Voros’s analysis, there are no turning points and, thus, the analysis is much simpler. The problem with Voros’s approach is that he only gives an approximation to the wave function on the classical energy curve. Even in simple cases, Voros’s expression does not admit a holomorphic extension to the whole plane, but has branching behavior inside the classical energy curve. Thus, Voros’s formula does not define an element of the quantum Hilbert space (which is a space of entire holomorphic functions), let alone an element of the domain of the Hamiltonian.

Nevertheless, it is possible to build approximate eigenfunctions as superpositions of coherent states, using formulas similar to those in Voros. This approach avoids dealing with turning points but still yields a rigorous eigenvalue estimate, with the same corrected Bohr-Sommerfeld condition as in Condition 15.1. See [31, 23, 7], or (in greater generality) [26].


15.8  Exercises 

1. Show that if $c_1$ is any complex number, then we have an identity of the form

$$c_1e^{i\theta} + \overline{c_1}e^{-i\theta} = R\cos(\theta - \delta)$$

for some real numbers $R$ and $\delta$.

2. Let $H(x, p) = p^2/2m + m\omega^2x^2/2$ be the Hamiltonian for a harmonic oscillator having mass $m$ and classical frequency $\omega$. Show that a positive number $E$ satisfies the corrected Bohr-Sommerfeld condition (Condition 15.1) if and only if $E$ is of the form $(n + 1/2)\hbar\omega$, where $n$ is a non-negative integer.

Note: In light of the results of Chap. 11, this calculation means that, in this very special case, the corrected Bohr-Sommerfeld condition gives the exact eigenvalues of the quantum Hamiltonian $\hat{H}$.

3. Suppose $A$ and $p$ are two nonzero, smooth functions satisfying (15.15). Show that $A(x) = C(p(x))^{-1/2}$ for some constant $C$.

Hint: Think in terms of the logarithms of the functions involved.

4. Show that $\cos(\theta - \delta)$, viewed as a function of $\theta$, agrees, up to multiplication by a constant, with $\cos(\theta - \delta')$ if and only if $\delta - \delta'$ is an integer multiple of $\pi$.

5. If $\psi$ is an eigenvector for $\hat{H}$ that is approximated by (15.25) near $-\infty$, one might hope to find an approximate expression for $\psi$ in the classically allowed region by analytically continuing around the turning point in the complex plane. Even assuming $V$ is analytic, however, it is fairly evident that analytic continuation in the upper half-plane does not give the same answer as in the lower half-plane. Nevertheless, one could use the average of the upper and lower half-plane results as a (totally nonrigorous) guess for the behavior of $\psi$ in the classically allowed region.

Show that the above approach gives the correct phase $\delta$ in the connection formula (15.21) but is off by a factor of 2 in the amplitude $R$.

6. Using integration by parts, show that the limit

$$\lim_{A \to +\infty}\int_0^A \cos\left(\frac{t^3}{3} + ut\right)dt$$

exists.

Hint: Multiply and divide by $t^2 + u$ (avoiding points where $t^2 + u = 0$ in the case $u < 0$).

7. In this exercise, we sketch an argument that the Airy function in (15.24) satisfies the differential equation $\psi''(u) - u\psi(u) = 0$. For the purposes of this exercise, let us say that $\int_0^\infty f(t)\,dt = C$ if $\int_0^A f(t)\,dt = C + g(A)$, where the function $g$ is bounded and oscillates around an average value of zero.

Assuming that it is legal to differentiate under the integral sign, verify that $\mathrm{Ai}(u)$ satisfies the stated equation.

Hint: After differentiating under the integral, look for a term that can be integrated explicitly.

Note: A more rigorous approach to this verification would be to integrate by parts as in Exercise 6 and then differentiate under the integral. This approach is, however, a bit messier.

8. By integrating by parts repeatedly in (15.24), show that $\mathrm{Ai}(u)$ decays faster than any power of $u$ as $u$ tends to $+\infty$.

Hint: A key point is to show that the boundary terms in the integration by parts vanish at every stage. After performing the integrations by parts, estimate the resulting integral by using the inequality

$$\frac{1}{(t^2 + u)^n} \leq \frac{1}{(t^2 + 1)^k}\,\frac{1}{u^{n-k}}, \quad u > 1,$$

for some appropriate choice of $k$.

9. (a) For $u < 0$, make the change-of-variable $\tau = t/\sqrt{-u}$ in the integral formula for the Airy function, to obtain the expression

$$\mathrm{Ai}(u) = \frac{\sqrt{-u}}{\pi}\int_0^\infty \cos\left(\alpha\left(\frac{\tau^3}{3} - \tau\right)\right)d\tau, \tag{15.46}$$

where $\alpha = (-u)^{3/2}$.

(b) Suppose $f$ is a smooth function on $[a, b]$ having a unique critical point $x_0$. Assuming that $x_0$ is in the interior of $[a, b]$ and that $f''(x_0) \neq 0$, the method of stationary phase asserts that

$$\int_a^b e^{i\alpha f(x)}\,dx = e^{i\alpha f(x_0)}\,e^{\pm i\pi/4}\sqrt{\frac{2\pi}{\alpha\,|f''(x_0)|}} + O\left(\frac{1}{\alpha}\right)$$

for $\alpha$ tending to $+\infty$, where the plus sign in the exponent is taken when $f''(x_0) > 0$ and the minus sign is taken when $f''(x_0) < 0$. (See, e.g., Eq. (5.12) in [30].)

Using this result, obtain the asymptotic formula (15.33).

Hint: Divide the integral in (15.46) into an integral over $[0, 2]$ and an integral over $[2, \infty)$. Use stationary phase for the first interval and integration by parts (as in Exercise 6) for the second interval.

10. Let ψ be the approximate eigenfunction for Ĥ defined in the
beginning of Sect. 15.6. Show that the norm of ψ is bounded and bounded
away from zero as ħ tends to zero.

Hint: First show that the L^2 norm of ψ over the intervals around
the turning points goes like ħ^{−1/6} ħ^{1/4}. Then check that the functions
p(x)^{−1/2} and q(x)^{−1/2} are square integrable near the turning points.


11.  By  imitating  the  arguments  in  the  proof  of  Lemma  15.9,  prove  the 
estimates  for  ipi  —  ipi  and  ?/q  —  hr  the  lemma. 

12.  By  writing  V (x)  as  F^ia—x)  plus  an  error  term  of  order  (x— a)2,  verify 
that  the  estimates  (15.44)  and  (15.45)  in  the  proof  of  Lemma  15.9 
hold in the transition region. (Assume that x − a is of order ħ^{1/2} in
the transition region.)

Hint: The leading-order Taylor expansion of (1 + z)^a is 1 + az + O(z^2),
for any real number a.
Lie Groups, Lie Algebras, and
Representations


An important concept in physics is that of symmetry, whether it be
rotational symmetry for many physical systems or Lorentz symmetry in
relativistic systems. In many cases, the group of symmetries of a system is
a continuous group, that is, a group that is parameterized by one or more
real parameters. More precisely, the symmetry group is often a Lie group,
that is, a smooth manifold endowed with a group structure in such a way
that operations of inversion and group multiplication are smooth. The
tangent space at the identity in a Lie group has a natural “bracket” operation
that makes the tangent space into a Lie algebra. The Lie algebra of a Lie
group encodes many of the properties of the Lie group, and yet the Lie
algebra is easier to work with because it is a linear space.

In quantum mechanics, the way symmetry is encoded is usually through
a unitary action of the group on the relevant Hilbert space. That is, we
assume we are given a unitary representation of the relevant symmetry
group G, that is, a continuous homomorphism of G into U(H), the group
of unitary operators on the quantum Hilbert space H. Actually, since two
unit vectors in H that differ only by a constant represent the same physical
state, we should more properly consider projective unitary representations.
A projective representation is a homomorphism of a group G into
U(H)/U(1), where U(1) is the group of complex numbers of magnitude 1,
thought of as multiples of I in U(H). An ordinary or projective
representation of a Lie group gives rise to an ordinary or projective representation
of its Lie algebra. The angular momentum operators, for example, form a
representation of the Lie algebra of the rotation group.
Saying that, for example, the Hamiltonian operator of a quantum system
is  invariant  under  rotations  means  that  H  commutes  with  the  relevant 
representation  of  the  rotation  group  and  thus  also  with  the  associated  Lie 
algebra  operators.  This  commutativity,  in  turn,  implies  that  the  eigenspaces 
for  H  are  invariant  under  rotations.  We  will  use  this  commutativity  in 
Chap.  18  to  help  us  in  determining  the  energy  eigenvectors  for  the  hydrogen 
atom. 

In  this  chapter,  we  will  make  a  brief  survey  of  Lie  groups,  Lie  algebras, 
and  their  representations.  For  our  purposes,  it  suffices  to  consider  matrix 
Lie groups, those that can be realized as closed subgroups of the group of
n × n invertible matrices. Inevitably, I have had to present some of the
deeper  results  without  proof.  Proofs  of  all  results  stated  here  can  be  found 
in  [21].  The  results  of  this  chapter  will  be  put  to  use  in  Chap.  17,  in  our 
study  of  angular  momentum,  and  in  Chap.  18,  in  our  study  of  the  hydrogen 
atom. 


16.1  Summary 

In this chapter, we will consider a matrix Lie group G, which is, by
definition, a (topologically) closed subgroup of some GL(n; C), where GL(n; C) is
the group of n × n invertible matrices with complex entries. To each such
G, we will associate the Lie algebra g of G, where g is a real subspace of
M_n(C), the space of all n × n matrices. We will see that G is automatically
an embedded real submanifold of M_n(C) and that g is the tangent space
of G at the identity matrix.

Now, g is not just a real vector space, but comes with a “bracket”
operation mapping g × g into g. Specifically, we will show that for all X and Y in
g, the matrix XY − YX belongs again to g. Thus, we define our bracket by
setting [X, Y] equal to XY − YX. As it turns out, the Lie algebra g, as a
vector space with the bracket operation, encodes a lot of information about
the group G. On the other hand, computing at the level of the Lie algebra
is generally easier than computing at the group level, simply because g is
a linear space.

We will be interested in unitary representations of our group G, that is,
continuous homomorphisms of G into U(H), the group of unitary operators
on a Hilbert space. If we restrict attention, at first, to the case in which
H is finite dimensional, then each representation Π of G gives rise to a
representation π of the Lie algebra g of G. That is to say, π is a linear
map of g into the space of linear maps of V to V, satisfying π([X, Y]) =
[π(X), π(Y)]. A deeper question is whether every representation π of g
comes from a representation Π of G. As it turns out, the answer in general
is no, but the answer is yes if G is simply connected.
We may consider, for example, the case G = SO(3). This group is not
simply connected. On the other hand, the Lie algebra so(3) of SO(3) is
isomorphic to the Lie algebra su(2) of SU(2), and SU(2) is simply connected.
[That is, SU(2) is the “universal cover” of SO(3).] Thus, given a
representation π of so(3), there may or may not be an associated representation Π
of SO(3). Even if there is not, however, there is always a representation Π
of the group SU(2).

In quantum mechanics, the vector e^{iθ}ψ represents the same physical
state as ψ. Thus, it is natural to consider “projective” unitary
representations, that is, homomorphisms of G into the quotient group U(H)/{e^{iθ}I}.
In the finite-dimensional case, each projective representation can be
“de-projectivized” at the level of the Lie algebra g of G. We can then pass
from the Lie algebra to the universal cover of G, that is, the simply
connected group with Lie algebra g. In particular, in the finite-dimensional
case, the irreducible projective unitary representations of SO(3) are in
one-to-one correspondence with irreducible ordinary unitary representations of
the universal cover SU(2) of SO(3). Although the Hilbert spaces of
physical systems are usually infinite dimensional, for compact groups such as
SO(3), general unitary representations can be decomposed as direct sums
of finite-dimensional ones. (See, e.g., Proposition 17.19 and the discussion
following it.)

16.2  Matrix  Lie  Groups 

Let M_n(C) denote the space of n × n matrices with complex entries. We
identify M_n(C) with C^{n^2}, equipped with the usual topology. Thus, a
sequence A_m in M_n(C) converges to a matrix A ∈ M_n(C) if (A_m)_{jk} converges
to A_{jk} as m tends to infinity, for all 1 ≤ j, k ≤ n. Let GL(n; C) denote the
general linear group, consisting of all invertible n × n matrices with
complex entries. Then GL(n; C) forms a group under the operation of matrix
multiplication. Furthermore, GL(n; C), that is, the set of A ∈ M_n(C) with
det A ≠ 0, is an open subset of M_n(C). Since M_n(C) is a complex vector
space of dimension n^2, it may be identified with C^{n^2} ≅ R^{2n^2}. Since GL(n; C)
is an open subset of M_n(C), it looks locally like R^{2n^2} and is therefore a real
manifold of dimension 2n^2.

Definition 16.1 A subgroup G of GL(n; C) is closed if for each sequence
A_m in G that converges to a matrix A, either A is again in G or A is not
invertible. A matrix Lie group is a closed subgroup of some GL(n; C).

A  subgroup  G  of  GL(n;  C)  is  closed  if  it  is  topologically  closed  as  a  subset 
of  GL(n;C) — but  not  necessarily  as  a  subset  of  Mn(C).  We  will  see  that 
each  matrix  Lie  group  is  a  real  embedded  submanifold  of  GL(n;  C)  and  thus 
is  a  Lie  group. 
Definition 16.2 If G_1 and G_2 are matrix Lie groups, then a Lie group
homomorphism of G_1 to G_2 is a continuous group homomorphism of G_1
into G_2. A Lie group homomorphism is called a Lie group isomorphism
if it is one-to-one and onto with continuous inverse. Two matrix Lie groups
are called isomorphic if there exists a Lie group isomorphism between
them.


Example 16.3 The real general linear group, denoted GL(n; R), is the
group of invertible n × n matrices with real entries. The groups SL(n; C)
and SL(n; R) are, respectively, the groups of complex and real matrices with
determinant 1. They are called the special linear groups.


Example 16.4 An n × n matrix U ∈ M_n(C) is said to be unitary if
U*U = UU* = I. A matrix U is unitary if and only if

⟨Uv, Uw⟩ = ⟨v, w⟩

for all v, w ∈ C^n. The group of unitary matrices is denoted U(n) and called
the (n × n) unitary group. The special unitary group, denoted SU(n),
is the subgroup of U(n) consisting of unitary matrices with determinant 1.


The condition (U*U)_{jk} = δ_{jk} is equivalent to the condition that the
columns of U form an orthonormal set in C^n, as can be seen by direct
computation. Geometrically, the condition U*U = I is equivalent to the
condition that ⟨Uv_1, Uv_2⟩ = ⟨v_1, v_2⟩ for all v_1, v_2 ∈ C^n, i.e., that U
preserves the inner product on C^n. By taking the determinant of the condition
U*U = I, we see that |det U| = 1 for all U ∈ U(n).

In this, the finite-dimensional case, the condition U*U = I implies that
U* is the inverse of U and thus that UU* = I. This result does not hold
in the infinite-dimensional case.


Example 16.5 An n × n real matrix R ∈ M_n(R) is said to be orthogonal
if R^tr R = R R^tr = I. A matrix R is orthogonal if and only if

⟨Rv, Rw⟩ = ⟨v, w⟩

for all v, w ∈ R^n. The group of orthogonal matrices is denoted O(n) and
is called the (n × n) orthogonal group. The special orthogonal group,
denoted SO(n), is the subgroup of O(n) consisting of orthogonal matrices
with determinant 1.

As in the unitary case, the condition R^tr R = I implies that R R^tr = I
and that the columns of R form an orthonormal set in R^n. Geometrically,
a real matrix R is in O(n) if and only if ⟨Rv_1, Rv_2⟩ = ⟨v_1, v_2⟩ for all
v_1, v_2 ∈ R^n, i.e., if and only if R preserves the inner product on R^n. By
taking the determinant of the condition R^tr R = I we see that det R = ±1
for all R ∈ O(n).

It  is  easy  to  verify  that  all  the  groups  in  Examples  16.3,  16.4,  and  16.5 
are,  indeed,  subgroups  of  GL(n,C)  and  that  they  are  closed. 




Definition 16.6 A matrix Lie group G is connected if for all A, B ∈ G
there is a continuous path A : [0, 1] → M_n(C) such that A(0) = A and
A(1) = B and such that A(t) lies in G for all t. A matrix Lie group G is
simply connected if it is connected and every continuous loop in G can
be shrunk continuously to a point in G. A matrix Lie group G is compact
if it is compact as a subset of M_n(C) ≅ R^{2n^2}.

By  the  Heine-Borel  theorem  (e.g.,  Proposition  0.26  of  [12]),  a  matrix 
Lie  group  G  is  compact  if  and  only  if  it  is  a  closed  and  bounded  subset 
of  Mn( C).  The  condition  we  are  calling  “connected”  is,  more  properly,  the 
condition  of  being  path  connected.  We  will  see,  however,  that  each  matrix 
Lie  group  is  an  embedded  real  submanifold  of  Mn(C)  and  is,  therefore, 
locally  path  connected.  For  matrix  Lie  groups,  then,  connectedness  and 
path  connectedness  are  equivalent. 

To prove that a matrix Lie group G is connected, it suffices to prove that
for all A ∈ G, there is a continuous path in G connecting A to I. After all,
if both A and B can be connected to I, then they can be connected to each
other.


Example 16.7 The groups O(n), SO(n), U(n), and SU(n) are compact.


Proof. The conditions defining these groups are obtained by setting certain
continuous functions equal to a constant. The group SU(n), for example, is
defined by setting (U*U)_{jk} = δ_{jk} for each j and k and by setting det U = 1.
These groups are thus closed not just as subsets of GL(n; C) but also as
subsets of M_n(C). Furthermore, each of these groups has the property that
each column of any matrix in the group is a unit vector. Thus, each group
is a bounded subset of M_n(C). ■


Example  16.8  The  group  U(n)  is  connected. 


Proof. If U ∈ M_n(C) is unitary, then U has an orthonormal basis of
eigenvectors with eigenvalues of absolute value 1. Thus, there is another
unitary matrix V (the change of basis matrix) such that

$$U = V \begin{pmatrix} e^{i\theta_1} & & \\ & \ddots & \\ & & e^{i\theta_n} \end{pmatrix} V^{-1}$$

for some real numbers θ_1, θ_2, …, θ_n. Thus, we can define a family U(t) of
unitary matrices by setting

$$U(t) = V \begin{pmatrix} e^{it\theta_1} & & \\ & \ddots & \\ & & e^{it\theta_n} \end{pmatrix} V^{-1}.$$

Then U(·) is a continuous path lying in U(n) with U(0) = I and U(1) = U.


Example  16.9  The  group  SU(2)  is  simply  connected. 


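Proof. We claim that

$$\mathrm{SU}(2) = \left\{ \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix} \,\middle|\, \alpha, \beta \in \mathbb{C},\ |\alpha|^2 + |\beta|^2 = 1 \right\}.$$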

It is easy to see that each matrix of the indicated form is indeed unitary and
has determinant 1. On the other hand, if U is any element of SU(2), then
the first column of U is a unit vector (α, β) ∈ C^2. The second column of
U must then be orthogonal to (α, β). Since (−β̄, ᾱ) is orthogonal to (α, β)
and C^2 is 2-dimensional, the second column of U must be a multiple of
(−β̄, ᾱ). But the only multiple that produces a matrix with determinant
1 is 1.

We see, then, that SU(2) is, topologically, the unit sphere S^3 inside C^2 ≅
R^4 and is, therefore, simply connected. ■
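The parametrization used in this proof can be checked numerically for a sample pair (α, β). The following is a minimal sketch in plain Python; the helper names (`su2_element`, `conjugate_transpose`, `mat_mul`, `det2`) and the sample values are illustrative assumptions, not notation from the text.

```python
def su2_element(alpha, beta):
    """Build [[alpha, -conj(beta)], [beta, conj(alpha)]] (hypothetical helper)."""
    return [[alpha, -beta.conjugate()],
            [beta, alpha.conjugate()]]

def conjugate_transpose(U):
    """Adjoint U* of a matrix stored as a list of rows."""
    return [[U[j][i].conjugate() for j in range(len(U))] for i in range(len(U[0]))]

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(U):
    """Determinant of a 2 x 2 matrix."""
    return U[0][0] * U[1][1] - U[0][1] * U[1][0]

# Sample point on the unit sphere in C^2: |alpha|^2 + |beta|^2 = 0.36 + 0.64 = 1.
alpha, beta = 0.6j, complex(0.8, 0.0)
U = su2_element(alpha, beta)

gram = mat_mul(conjugate_transpose(U), U)   # should be close to the identity
determinant = det2(U)                       # should be close to 1
```

Up to rounding, `gram` is the identity matrix and `determinant` is 1, as the proof asserts for every matrix of this form.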


16.3  Lie  Algebras 


We  now  introduce  the  general  algebraic  concept  of  a  Lie  algebra.  Once  this 
is  done,  we  will  show  how  to  associate  a  real  Lie  algebra  with  an  arbitrary 
matrix  Lie  group. 


Definition 16.10 A Lie algebra over a field F is a vector space g over
F, together with a “bracket” map [·, ·] : g × g → g having the following
properties:

1. [·, ·] is bilinear

2. [Y, X] = −[X, Y] for all X, Y ∈ g

3. [X, X] = 0 for all X ∈ g

4. For all X, Y, Z ∈ g we have the Jacobi identity

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0.


If the characteristic of F is not equal to 2, then Property 3 is a
consequence of Property 2 (take Y = X to obtain 2[X, X] = 0). If F = R, then
we say that g is a real Lie algebra. An example of a real Lie algebra is the
vector space R^3 with the bracket equal to the cross product. Properties 1,
2, and 3 are evident from the definition of the cross product, while the
Jacobi identity is a known property of the cross product that can be
verified by direct calculation.
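The direct calculation for the cross product can be spot-checked on sample vectors. The sketch below uses integer vectors so that the arithmetic is exact; the function names `cross` and `add` and the three sample vectors are assumptions chosen for illustration, and a check on one triple is of course not a proof.

```python
def cross(x, y):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def add(*vectors):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(components) for components in zip(*vectors))

X, Y, Z = (1, 2, 3), (-4, 0, 5), (2, -7, 1)

# Jacobi identity: [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] with bracket = cross product.
jacobi = add(cross(X, cross(Y, Z)),
             cross(Y, cross(Z, X)),
             cross(Z, cross(X, Y)))

# Skew-symmetry: [X,Y] + [Y,X].
skew = add(cross(X, Y), cross(Y, X))
```

Both `jacobi` and `skew` come out as the zero vector for these inputs.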

A  large  class  of  Lie  algebras  may  be  obtained  by  the  following  procedure. 
Example 16.11 Let A be an associative algebra and let g be a subspace of
A with the property that for all x, y in g, xy − yx is again in g. Then the
bracket

[x, y] := xy − yx

makes g into a Lie algebra.

In Example 16.11, we may take, for example, g = A. It is evident that
this bracket satisfies Properties 1, 2, and 3 of a Lie algebra, and the
Jacobi identity is easily verified by direct calculation. As it turns out, every
Lie algebra is isomorphic to a Lie algebra of this type. (This claim is a
consequence of the Poincaré-Birkhoff-Witt theorem, which is proved, for
example, in Sect. 5.2 of [25]. The algebra A in the Poincaré-Birkhoff-Witt
theorem is the so-called universal enveloping algebra of g.)

Definition 16.12 If g_1 and g_2 are Lie algebras, a map φ : g_1 → g_2 is
called a Lie algebra homomorphism if φ is linear and φ satisfies

φ([X, Y]) = [φ(X), φ(Y)]

for all X, Y ∈ g_1. A Lie algebra homomorphism is called a Lie algebra
isomorphism if it is one-to-one and onto.

Definition 16.13 If g is a Lie algebra, a subalgebra of g is a subspace h
of g with the property that [X, Y] ∈ h for all X and Y in h. An ideal in g
is a subalgebra h of g with the stronger property that [X, Y] ∈ h for all X
in g and Y in h.

The  notion  of  a  subalgebra  of  a  Lie  algebra  is  analogous  to  the  notion 
of  a  subgroup  of  a  group,  while  the  notion  of  an  ideal  in  a  Lie  algebra  is 
analogous  to  the  notion  of  a  normal  subgroup  of  a  group.  In  particular, 
the  kernel  of  any  Lie  algebra  homomorphism  is  an  ideal,  just  as  the  kernel 
of  a  group  homomorphism  is  a  normal  subgroup. 

Definition 16.14 The direct sum of Lie algebras g_1 and g_2, denoted
g_1 ⊕ g_2, is the direct sum of g_1 and g_2 as a vector space, equipped with the
bracket given by

[(X_1, Y_1), (X_2, Y_2)] = ([X_1, X_2], [Y_1, Y_2])

for all X_1, X_2 ∈ g_1 and Y_1, Y_2 ∈ g_2.


16.4  The  Matrix  Exponential 

In  the  next  section,  we  will  associate  a  Lie  algebra  with  each  matrix  Lie 
group.  To  describe  this  association,  we  need  the  notion  of  the  exponential 
of a matrix. Given a matrix X ∈ M_n(C), we define the matrix exponential
of X, denoted by e^X or exp(X), by the usual power series,

$$e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!},$$

where X^0 = I (the identity matrix). This series converges absolutely for
all X ∈ M_n(C), as can easily be seen using the inequality ‖X^m‖ ≤ ‖X‖^m,
where ‖X‖ is the operator norm of X; see Definition A.35. (In this, the
finite-dimensional case, we could just as well use the Hilbert-Schmidt norm,
which amounts to using the usual Euclidean norm on M_n(C) = C^{n^2}. See
Exercise 3.) The matrix exponential shares some but not all of the
properties of the exponential of a number.


Theorem 16.15 The matrix exponential has the following properties for
all X, Y ∈ M_n(C).


1. e^0 = I

2. e^{X^tr} = (e^X)^tr and e^{X*} = (e^X)*

3. If A is an invertible n × n matrix, then

e^{AXA^{-1}} = A e^X A^{-1}.


4. det(e^X) = e^{trace(X)}

5. If XY = YX then e^{X+Y} = e^X e^Y

6. e^X is invertible and (e^X)^{-1} = e^{−X}

7. Even if XY ≠ YX, we have

$$e^{X+Y} = \lim_{m\to\infty}\left(e^{X/m}\, e^{Y/m}\right)^m.$$


Here X^tr and X* denote the transpose and adjoint (conjugate transpose)
of X, respectively. Property 7 is known as the Lie product formula and is
a special case of the Trotter product formula (Theorem 20.1). Properties
1, 2, and 3 are easily verified using term-by-term computation. Property 6
follows from Property 5 by taking Y = −X and applying Property 1. The
proofs of Properties 4, 5, and 7 are outlined in Exercises 5, 6, and 7.
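Several of these properties can be observed numerically by summing the power series directly. The following plain-Python sketch is only an illustration under stated assumptions: the helper names (`mat_exp`, `mat_mul`, `mat_add`, `scale`, `max_abs_diff`), the truncation at 40 terms, and the sample matrices are all choices made here, and truncating the series is adequate only for matrices of modest norm.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def mat_exp(X, terms=40):
    """Partial sum of the series sum_{k < terms} X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # current term X^k / k!, starting at k = 0
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, X))
        total = mat_add(total, term)
    return total

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Two noncommuting matrices: e^{X+Y} and e^X e^Y differ noticeably.
X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
exp_sum = mat_exp(mat_add(X, Y))
gap_noncommuting = max_abs_diff(exp_sum, mat_mul(mat_exp(X), mat_exp(Y)))

# Property 7 (Lie product formula): (e^{X/m} e^{Y/m})^m approaches e^{X+Y}.
m = 1000
step = mat_mul(mat_exp(scale(1.0 / m, X)), mat_exp(scale(1.0 / m, Y)))
lie_product = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(m):
    lie_product = mat_mul(lie_product, step)
gap_lie_product = max_abs_diff(exp_sum, lie_product)

# Property 4: det(e^Z) = e^{trace(Z)} for a sample Z with trace 0.5.
Z = [[0.3, 1.0], [-0.5, 0.2]]
E = mat_exp(Z)
det_exp = E[0][0] * E[1][1] - E[0][1] * E[1][0]
trace_side = math.exp(Z[0][0] + Z[1][1])
```

Here `gap_noncommuting` is of order 1, `gap_lie_product` is small and shrinks as `m` grows, and `det_exp` agrees with `trace_side` up to rounding.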

Suppose a matrix X is diagonalizable, meaning that

$$X = A \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} A^{-1}$$

for some invertible matrix A and complex numbers λ_1, λ_2, …, λ_n. Then
using Property 3 of Theorem 16.15, it is easy to see that

$$e^X = A \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} A^{-1}.$$

If X is not diagonalizable, e^X can be computed in terms of the SN
decomposition of X. See Sect. 2.2 of [21] for details.


Example 16.16 If

$$X = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix},$$

then

$$e^X = \begin{pmatrix} \cos a & \sin a \\ -\sin a & \cos a \end{pmatrix}.$$


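Proof. The eigenvalues of X are ±ia and the corresponding eigenvectors
are (1, ±i). Thus, we may calculate that

$$\begin{aligned} e^X &= \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} \begin{pmatrix} e^{ia} & 0 \\ 0 & e^{-ia} \end{pmatrix} \frac{1}{(-2i)} \begin{pmatrix} -i & -1 \\ -i & 1 \end{pmatrix} \\ &= \frac{1}{(-2i)} \begin{pmatrix} -i(e^{ia}+e^{-ia}) & -e^{ia}+e^{-ia} \\ e^{ia}-e^{-ia} & -i(e^{ia}+e^{-ia}) \end{pmatrix}, \end{aligned}$$

which simplifies to the desired result. ■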
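The diagonalization in this proof can be replayed numerically with complex arithmetic. The sketch below is an illustration only: the function name `exp_rotation_generator` and the sample angle are assumptions, and the eigenvector matrix and its inverse are typed in by hand following the eigenvectors (1, ±i) named in the proof.

```python
import cmath
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def exp_rotation_generator(a):
    """e^X for X = [[0, a], [-a, 0]], computed as A diag(e^{ia}, e^{-ia}) A^{-1}."""
    A = [[1, 1], [1j, -1j]]                    # columns are the eigenvectors (1, i), (1, -i)
    D = [[cmath.exp(1j * a), 0], [0, cmath.exp(-1j * a)]]
    c = 1 / (-2j)                              # 1 / det(A)
    A_inv = [[c * (-1j), c * (-1)],
             [c * (-1j), c * 1]]
    return mat_mul(mat_mul(A, D), A_inv)

a = 0.7
E = exp_rotation_generator(a)
expected = [[math.cos(a), math.sin(a)],
            [-math.sin(a), math.cos(a)]]
error = max(abs(E[i][j] - expected[i][j]) for i in range(2) for j in range(2))
```

The value of `error` is at the level of floating-point rounding, so the product collapses to the rotation-type matrix with entries cos a and ±sin a.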

The relation e^{X+Y} = e^X e^Y certainly does not hold for general
(noncommuting) matrices X and Y. Nevertheless, for any X ∈ M_n(C) we have

e^{(s+t)X} = e^{sX} e^{tX}

for all s and t in R, since sX commutes with tX. Thus, for each X, the set
of matrices of the form e^{tX}, t ∈ R, forms a subgroup of GL(n; C). It is not
hard to show (Exercise 4), using term-by-term differentiation, that

$$\left.\frac{d}{dt}\, e^{tX}\right|_{t=0} = X. \tag{16.1}$$

Here, the derivative of a matrix-valued function is defined as being
entrywise. [That is, if f(t) is a matrix-valued function, df/dt is the matrix-valued
function whose (j, k) entry is d(f(t)_{jk})/dt.]
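Both the group law e^{(s+t)X} = e^{sX} e^{tX} and the derivative formula (16.1) can be observed numerically. The following sketch assumes a truncated power series for the exponential and a one-sided finite difference for the entrywise derivative; the names `mat_exp`, `scale`, `mat_mul`, `max_abs_diff` and the sample values are illustrative choices, not part of the text.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def mat_exp(X, terms=40):
    """Partial sum of the series sum_{k < terms} X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, X))
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 2.0], [-1.0, 0.5]]
s, t = 0.3, -1.1

# Group law for the one-parameter family t -> e^{tX}.
group_law_error = max_abs_diff(mat_exp(scale(s + t, X)),
                               mat_mul(mat_exp(scale(s, X)), mat_exp(scale(t, X))))

# Entrywise finite-difference approximation of d/dt e^{tX} at t = 0.
h = 1e-6
identity = [[1.0, 0.0], [0.0, 1.0]]
shifted = mat_exp(scale(h, X))
difference_quotient = [[(shifted[i][j] - identity[i][j]) / h for j in range(2)]
                       for i in range(2)]
derivative_error = max_abs_diff(difference_quotient, X)
```

The first error is at rounding level, while the second is of order h, as expected for a one-sided difference quotient.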

Definition 16.17 A one-parameter subgroup of GL(n; C) is a
continuous homomorphism of R into GL(n; C), that is, a continuous map A : R →
GL(n; C) such that A(0) = I and A(s + t) = A(s)A(t) for all s, t ∈ R.

Theorem 16.18 If A(·) is a one-parameter subgroup of GL(n; C), there
exists a unique X ∈ M_n(C) such that

A(t) = e^{tX}


for all t ∈ R.

This  is  Theorem  2.13  in  [21]. 


16.5  The  Lie  Algebra  of  a  Matrix  Lie  Group 


We now associate a Lie algebra g to each matrix Lie group G.

Definition 16.19 If G ⊂ GL(n; C) is a matrix Lie group, then the Lie
algebra g of G is defined as follows:

g = {X ∈ M_n(C) | e^{tX} ∈ G for all t ∈ R}.

That is to say, X belongs to g if and only if the one-parameter subgroup
generated by X lies entirely in G. Note that to have X belong to g, we
need only have e^{tX} belong to G for all real numbers t.

Proposition 16.20 For any matrix Lie group G, the Lie algebra g of G
has the following properties.


1. The zero matrix 0 belongs to g.

2. For all X in g, tX belongs to g for all real numbers t.

3. For all X and Y in g, X + Y belongs to g.

4. For all A ∈ G and X ∈ g we have AXA^{-1} ∈ g.

5. For all X and Y in g, the commutator [X, Y] := XY − YX belongs
to g.


The first three properties of g say that g is a real vector space. Since
M_n(C) is an associative algebra under the operation of matrix
multiplication, the last property of g shows that g is a real Lie algebra
(Example 16.11).

Proof. Points 1 and 2 are elementary, and Point 3 follows from the Lie
product formula, using the assumption that G is closed. Point 4 follows
from Property 3 in Theorem 16.15. To verify Point 5, we observe that the
commutator [X, Y] may be computed as

$$[X, Y] = \left.\frac{d}{dt}\, e^{tX}\, Y e^{-tX}\right|_{t=0},$$

using (16.1) and an easily verified product rule for differentiation of
matrix-valued functions. For X, Y ∈ g, e^{tX} Y e^{−tX} belongs to g for all t ∈ R, by
Point 4. Furthermore, we have already shown that g is a real subspace of
M_n(C) and therefore a closed subset of M_n(C). Thus,

$$[X, Y] = \lim_{h\to 0} \frac{e^{hX}\, Y e^{-hX} - Y}{h}$$

belongs to g. ■
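The limit used for Point 5 can be watched numerically. The sketch below is an illustration under stated assumptions: it takes two skew-symmetric 3 × 3 matrices as sample elements of so(3), approximates the exponential by a truncated power series, and compares the difference quotient at a small h with the commutator; all helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(X, terms=40):
    """Partial sum of the series sum_{k < terms} X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, X))
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

def max_abs(A):
    return max(abs(a) for row in A for a in row)

# Sample skew-symmetric matrices (elements of so(3)).
X = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
Y = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]

commutator = mat_sub(mat_mul(X, Y), mat_mul(Y, X))

h = 1e-6
conjugated = mat_mul(mat_mul(mat_exp(scale(h, X)), Y), mat_exp(scale(-h, X)))
quotient = scale(1.0 / h, mat_sub(conjugated, Y))
limit_error = max_abs(mat_sub(quotient, commutator))

# Point 4 in action: the conjugated matrix is still skew-symmetric.
transpose = [[conjugated[j][i] for j in range(3)] for i in range(3)]
skew_error = max_abs([[conjugated[i][j] + transpose[i][j] for j in range(3)]
                      for i in range(3)])
```

The difference quotient matches the commutator up to an error of order h, and the conjugated matrix stays skew-symmetric up to rounding.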

Example 16.21 Let gl(n; C), gl(n; R), sl(n; C), and sl(n; R) denote the Lie
algebras of GL(n; C), GL(n; R), SL(n; C), and SL(n; R), respectively. Then
we have


gl(n; C) = M_n(C)
gl(n; R) = M_n(R)

sl(n; C) = {X ∈ M_n(C) | trace(X) = 0}
sl(n; R) = {X ∈ M_n(R) | trace(X) = 0}.

Proof. Let us consider, for example, the case of sl(n; C). By Property 4 of
Theorem 16.15, if trace(X) = 0, then

det(e^{tX}) = e^{t trace(X)} = e^0 = 1,

so that e^{tX} ∈ SL(n; C). In the other direction, if X ∈ sl(n; C), then by
the above calculation, we must have e^{t trace(X)} = 1 for all t ∈ R, which is
possible only if trace(X) = 0. The proofs of the other cases are similar and
are omitted. ■

Example 16.22 The Lie algebras u(n) and su(n) of U(n) and SU(n) are
given by


u(n) = {X ∈ M_n(C) | X* = −X}
su(n) = {X ∈ u(n) | trace(X) = 0}.


The Lie algebra so(n) of SO(n) is given by


so(n) = {X ∈ M_n(R) | X^tr = −X}.


Finally, the Lie algebra of O(n) is equal to so(n).

Proof. If X* = −X, then by Property 2 of Theorem 16.15,

$$(e^{tX})^* = e^{tX^*} = e^{-tX} = (e^{tX})^{-1},$$

showing that e^{tX} is unitary. In the other direction, if e^{tX} is unitary for all
t ∈ R, then (e^{tX})* = (e^{tX})^{-1} = e^{−tX}. Thus, e^{tX*} = e^{−tX}. Differentiating
this relation at t = 0, using (16.1), gives X* = −X. Thus, the Lie algebra of
U(n) consists exactly of the matrices with the property that X* = −X. For
the Lie algebra of SU(n), we add the trace-zero condition, as in the proof
of Example 16.21. The calculations for SO(n) are similar and are omitted.
Note that if X ∈ M_n(R) satisfies X^tr = −X, then the diagonal entries of X
are zero and, thus, trace(X) is automatically 0. This observation explains
why the Lie algebras of O(n) and SO(n) are the same. ■
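The first half of this proof can be illustrated numerically: exponentiating a skew-adjoint, trace-zero matrix should give a unitary matrix of determinant 1. The sketch below assumes a truncated power series for the exponential; the sample matrix, the value of t, and the helper names are illustrative choices.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def conjugate_transpose(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def mat_exp(X, terms=40):
    """Partial sum of the series sum_{k < terms} X^k / k!."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, X))
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

# Sample X with X* = -X and trace(X) = 0, i.e. a candidate element of su(2).
X = [[1j, 2 + 1j],
     [-2 + 1j, -1j]]
X_star = conjugate_transpose(X)
is_skew_adjoint = all(X_star[i][j] == -X[i][j] for i in range(2) for j in range(2))

t = 0.5
U = mat_exp(scale(t, X))
gram = mat_mul(conjugate_transpose(U), U)        # should be close to the identity
unitarity_error = max(abs(gram[i][j] - (1 if i == j else 0))
                      for i in range(2) for j in range(2))
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]    # should be close to 1
```

Both `unitarity_error` and `abs(det_U - 1)` come out at rounding level for this sample.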

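Specializing Example 16.22 to the case n = 3 gives

$$\mathrm{so}(3) = \left\{ \begin{pmatrix} 0 & -a & b \\ a & 0 & -c \\ -b & c & 0 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}.$$

We can use the following basis for so(3):

$$F_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad F_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad F_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{16.2}$$

Direct calculation establishes the following commutation relations for the
F_j's:

[F_1, F_2] = F_3
[F_2, F_3] = F_1
[F_3, F_1] = F_2.  (16.3)

More concisely, we have [F_1, F_2] = F_3, together with relations obtained
from this one by cyclic permutation of the indices. Note that all remaining
commutation relations follow from (16.3) by means of the skew-symmetry
of the bracket; we have, for example, [F_2, F_1] = −F_3 and [F_1, F_1] = 0.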
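The commutation relations for this basis of so(3) can be confirmed by exact integer arithmetic. In the sketch below the three matrices are typed in as the standard skew-symmetric basis with F_1, F_2, F_3 generating rotations about the three coordinate axes; the helper names `mat_mul` and `bracket` are illustrative.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

relations_hold = (bracket(F1, F2) == F3 and
                  bracket(F2, F3) == F1 and
                  bracket(F3, F1) == F2)

# Skew-symmetry gives the remaining relations, e.g. [F2, F1] = -F3.
minus_F3 = [[-entry for entry in row] for row in F3]
reversed_relation_holds = bracket(F2, F1) == minus_F3
```

Because the entries are integers, the comparisons are exact rather than approximate.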


16.6  Relationships  Between  Lie  Groups  and  Lie 
Algebras 

In  this  section,  we  explore  the  relationships  between  matrix  Lie  groups  and 
their  Lie  algebras.  In  particular,  we  investigate  the  question  of  the  extent 
to  which  a  matrix  Lie  group  is  determined  (up  to  isomorphism)  by  its  Lie 
algebra.  We  begin  by  showing  that  every  Lie  group  homomorphism  gives 
rise  to  a  Lie  algebra  homomorphism  in  a  natural  way. 

Theorem 16.23 Suppose G_1 and G_2 are matrix Lie groups with Lie
algebras g_1 and g_2, respectively, and suppose Φ : G_1 → G_2 is a Lie group
homomorphism. Then there exists a unique linear map φ : g_1 → g_2 such
that

Φ(e^{tX}) = e^{tφ(X)}

for all t ∈ R and X ∈ g_1. This linear map has the following additional
properties:


1. φ([X, Y]) = [φ(X), φ(Y)] for all X, Y ∈ g_1

2. φ(AXA^{-1}) = Φ(A)φ(X)Φ(A)^{-1} for all A ∈ G_1 and X ∈ g_1

3. φ(X) may be computed as

$$\phi(X) = \left.\frac{d}{dt}\, \Phi(e^{tX})\right|_{t=0}.$$

Point 1 shows that φ is a Lie algebra homomorphism. Part of the assertion
of Point 3 of the theorem is that Φ(e^{tX}) is a smooth function of t for each X.

To construct φ, note that since Φ is a continuous homomorphism, the
map t ↦ Φ(e^{tX}) is a one-parameter subgroup. By Theorem 16.18, there
exists a unique Y such that Φ(e^{tX}) = e^{tY} for all t ∈ R. We then set
φ(X) = Y. An argument similar to the proof of Proposition 16.20 then
establishes the desired properties of φ. See the proof of Theorem 2.21 in
[21] for the details.
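As an illustration of this construction (an example chosen here, not one given in the text), consider the determinant, viewed as a Lie group homomorphism of GL(n; C) into GL(1; C). Property 4 of Theorem 16.15 identifies the associated Lie algebra homomorphism as the trace, and Point 1 then reduces to the familiar fact that commutators have trace zero:

```latex
% Assumed example: \Phi = \det, regarded as a map into 1 x 1 invertible matrices.
\Phi = \det : \mathrm{GL}(n;\mathbb{C}) \to \mathrm{GL}(1;\mathbb{C}), \qquad
\Phi(e^{tX}) = \det(e^{tX}) = e^{t\,\mathrm{trace}(X)}
\quad\Longrightarrow\quad
\phi(X) = \mathrm{trace}(X).

% Point 1 of the theorem, with the (commutative) bracket on 1 x 1 matrices:
\mathrm{trace}([X,Y]) = [\mathrm{trace}(X), \mathrm{trace}(Y)] = 0 .
```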

Corollary 16.24 Suppose that G_1 and G_2 are matrix Lie groups with Lie
algebras g_1 and g_2, respectively. If G_1 is isomorphic to G_2, then g_1 is
isomorphic to g_2.

Proof.  See  Exercise  11.  ■ 

Our  next  task  is  to  show  that  for  any  matrix  Lie  group  G,  the  Lie  algebra 
g  of  G  is  large  enough  to  capture  what  is  happening  in  a  neighborhood  of 
the  identity  in  G.  This  will  show,  for  example,  that  for  connected  matrix 
Lie  groups,  a  Lie  group  homomorphism  is  determined  by  the  corresponding 
Lie  algebra  homomorphism. 

Theorem 16.25 Let G be a matrix Lie group with Lie algebra g. Then
there exists a neighborhood U of 0 in M_n(C) and a neighborhood V of I in
M_n(C) such that the matrix exponential maps U diffeomorphically onto V
and such that for all X ∈ U, we have that X belongs to g if and only if e^X
belongs to G.

See Theorem 2.27 in [21]. This result has a number of important
consequences.

Corollary 16.26 Every matrix Lie group G ⊂ GL(n; C) is a real embedded
submanifold of M_n(C) with the dimension of G equal to the dimension of
g as a real vector space.

The claim means, more precisely, that for each $A \in G$, there exists a
neighborhood $U$ of $A$ and a diffeomorphism $\Phi$ of $U$ with a neighborhood
$V$ of 0 in $\mathbb{R}^{2n^2}$ such that $\Phi(U \cap G) = V \cap \mathbb{R}^d$, where $d = \dim \mathfrak{g}$. That is to
say, after a change of coordinates, $G$ “looks” locally like a little piece of
$\mathbb{R}^d$ sitting inside $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$.

Proof. We use exponential coordinates in the neighborhood $V$ of $I$ in
$M_n(\mathbb{C})$, meaning that we write each element $A$ of $V$ as $A = e^X$, with
$X \in U$. Theorem 16.25 says that near the identity, in these coordinates, $G$
“looks like” the real vector space $\mathfrak{g}$ inside $M_n(\mathbb{C})$. Given any other point
$A \in G$, we can use left multiplication by $A^{-1}$ to move the action to the
identity (Exercise 17), with the result that $G$ looks like $\mathfrak{g} \subset M_n(\mathbb{C})$ near $A$.
Thus, $G$ is a real embedded submanifold of dimension $d = \dim \mathfrak{g}$. ■

Corollary 16.27 The Lie algebra $\mathfrak{g}$ of a matrix Lie group $G$ is the tangent
space to $G$ at $I$. That is to say, $\mathfrak{g}$ coincides with the set of those $X$ in $M_n(\mathbb{C})$
for which there exists a smooth curve $\gamma : \mathbb{R} \to M_n(\mathbb{C})$ lying entirely in $G$
and such that $\gamma(0) = I$ and $\gamma'(0) = X$.

Proof. If $X \in \mathfrak{g}$, then $X$ is the derivative of $e^{tX}$ at $t = 0$, so $\mathfrak{g}$ is contained
in the tangent space at $I$. In the other direction, if $\gamma$ is any smooth curve
in $M_n(\mathbb{C})$ that lies entirely in $G$ and passes through $I$ at $t = 0$, then by
Theorem 16.25, we can express $\gamma$ as $\gamma(t) = e^{\delta(t)}$ (at least for small $t$), where
$\delta$ is a smooth curve in $\mathfrak{g}$ with $\delta(0) = 0$. It is then easy to see (Exercise 8)
that $\gamma'(0) = \delta'(0)$. But if $\delta$ lies in $\mathfrak{g}$, then $\delta'(0)$, which equals $\gamma'(0)$, also lies
in $\mathfrak{g}$, as in the proof of Proposition 16.20. Thus, the tangent space at $I$ is
contained in $\mathfrak{g}$. ■

Corollary 16.28 If a matrix Lie group $G$ is connected, then for all $A \in G$
there exists a finite sequence $X_1, X_2, \ldots, X_N$ of elements of $\mathfrak{g}$ such that

$$A = e^{X_1} e^{X_2} \cdots e^{X_N}.$$

Proof. If $G$ is connected in the sense of Definition 16.6 (which really means
that $G$ is path connected), then $G$ is certainly connected in the usual
topological sense of having no nontrivial sets that are both open and closed.
Let $U$ denote the set of points in $G$ that can be expressed as a product
of exponentials of elements of $\mathfrak{g}$. This set is open in $G$ because if $A \in U$
and $B \in G$ is close to $A$, then $A^{-1}B$ is close to $I$ in $G$, and therefore
$A^{-1}B = e^X$ for some $X \in \mathfrak{g}$. Thus, $B = Ae^X$, which means that $B$ is also
a product of exponentials. In the other direction, if $B \in G$ is in the closure
of $U$, then there is some element $A$ of $U$ that is close to $B$. We then have,
again, that $B = Ae^X$ for some $X \in \mathfrak{g}$, which, again, means that $B \in U$.
Now, $G$ is connected and $U$ is both open and closed. Since $U$ is nonempty
($I \in U$), we have $U = G$. ■

Corollary 16.29 Suppose that $G_1$ and $G_2$ are matrix Lie groups with
Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. Suppose that $\Phi_1 : G_1 \to G_2$ and
$\Phi_2 : G_1 \to G_2$ are Lie group homomorphisms, with associated Lie algebra
homomorphisms $\phi_1$ and $\phi_2$, respectively. If $G_1$ is connected and $\phi_1 = \phi_2$,
then $\Phi_1 = \Phi_2$.

Proof. The result follows from Corollary 16.28 and the condition
$\Phi_j(e^X) = e^{\phi_j(X)}$, $j = 1, 2$. ■

We  have  seen  that  a  homomorphism  of  matrix  Lie  groups  gives  rise  to  a 
homomorphism  of  the  associated  Lie  algebra,  and  (Corollary  16.29)  that  if 
the  domain  group  is  connected,  the  Lie  algebra  homomorphism  determines 
the  Lie  group  homomorphism.  A  more  difficult  question  is  whether  we  can 
go  in  the  opposite  direction,  from  a  Lie  algebra  homomorphism  to  a  Lie 
group  homomorphism.  That  is  to  say,  given  a  Lie  algebra  homomorphism 
between  the  Lie  algebras  of  two  matrix  Lie  groups,  does  there  exist  a  Lie 
group homomorphism related in the usual way to the Lie algebra
homomorphism? The answer turns out to be yes, provided that the domain group
$G_1$ is connected and simply connected (i.e., that every continuous loop in
$G_1$ can be shrunk continuously in $G_1$ to a point).

Theorem 16.30 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose that $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie
algebra homomorphism. If $G_1$ is connected and simply connected, then
there exists a unique Lie group homomorphism $\Phi : G_1 \to G_2$ such that $\Phi$
and $\phi$ are related as in Theorem 16.23.

One way to prove this deep result is to make use of the
Baker-Campbell-Hausdorff formula. (See, e.g., Chap. 3 of [21].) This formula
states that for all sufficiently small $X$ and $Y$ in $M_n(\mathbb{C})$ we have

$$e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots}.$$

Here $\cdots$ denotes terms that are expressible in terms of repeated
commutators involving $X$ and $Y$, with coefficients that are “universal,” that is,
independent of $n$ (the size of the matrices) and of the choice of $X$ and $Y$ in
$M_n(\mathbb{C})$. Given a Lie algebra homomorphism $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$, one can use the
Baker-Campbell-Hausdorff formula to construct a “local homomorphism,”
mapping a neighborhood of the identity in $G_1$ into $G_2$. If $G_1$ is connected
and simply connected, it is possible to extend this local homomorphism to a
global homomorphism. See Sect. 3.6 of [21] for the details of this
construction.
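
The displayed terms of the Baker-Campbell-Hausdorff series can be checked numerically. The following is a minimal sketch, not part of the text's argument: it assumes two small nilpotent $2 \times 2$ matrices as test data (the choice of `X`, `Y`, and `eps` is purely illustrative) and compares $e^X e^Y$ with the exponential of the series truncated after first and after third order. All helper names are hypothetical.

```python
# Numerical sanity check of the Baker-Campbell-Hausdorff series through
# third order, in pure Python (no external libraries).

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Return the sum of c * M over the given (c, M) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return lincomb((1, matmul(A, B)), (-1, matmul(B, A)))

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (adequate for small A)."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = lincomb((1, total), (1, term))
    return total

def max_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

eps = 0.05                       # illustrative "small" size for X and Y
X = [[0.0, eps], [0.0, 0.0]]
Y = [[0.0, 0.0], [eps, 0.0]]

lhs = matmul(expm(X), expm(Y))
XY = comm(X, Y)
Z1 = lincomb((1, X), (1, Y))                              # X + Y only
Z3 = lincomb((1, X), (1, Y), (0.5, XY),
             (1 / 12, comm(X, XY)), (-1 / 12, comm(Y, XY)))  # through 3rd order

err_first_order = max_diff(lhs, expm(Z1))
err_third_order = max_diff(lhs, expm(Z3))
print(err_first_order, err_third_order)
```

With these inputs the third-order truncation should agree with $e^X e^Y$ far better than $e^{X+Y}$ does, since the first neglected term is of fourth order in `eps`.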

Corollary 16.31 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. If $G_1$ and $G_2$ are connected and simply
connected and $\mathfrak{g}_1$ is isomorphic to $\mathfrak{g}_2$, then $G_1$ is isomorphic to $G_2$.

Proof. Suppose $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra isomorphism. Since $G_1$ is
connected and simply connected, there exists a Lie group homomorphism
$\Phi : G_1 \to G_2$ related in the usual way to $\phi$. Since $G_2$ is connected and
simply connected, there exists a Lie group homomorphism $\Psi : G_2 \to G_1$
related in the usual way to $\phi^{-1}$. Consider now the homomorphism
$\Psi \circ \Phi : G_1 \to G_1$.
By the composition property of Lie algebra homomorphisms (Exercise 10),
the Lie algebra homomorphism associated with $\Psi \circ \Phi$ is $\phi^{-1} \circ \phi = I$. It then
follows from Corollary 16.29 that $\Psi \circ \Phi = I$. A similar argument shows that
$\Phi \circ \Psi = I$, which means that $\Phi$ is a Lie group isomorphism. ■

Corollary  16.31  does  not  hold  without  the  assumption  that  both  groups 
are  simply  connected,  as  the  following  important  example  shows. 


Example 16.32 The Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic, but the
groups SU(2) and SO(3) are not isomorphic.


Since SU(2) is simply connected (Example 16.9), SO(3) must fail to be
simply connected. Indeed, $\pi_1(\mathrm{SO}(3)) = \mathbb{Z}/2$, as can be seen from
Example 16.34.

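Proof. The Lie algebra $\mathfrak{su}(2)$ of SU(2) is the space of $2 \times 2$ skew-self-adjoint
matrices with trace zero. Explicitly,

$$\mathfrak{su}(2) = \left\{ \begin{pmatrix} ia & b + ic \\ -b + ic & -ia \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}.$$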

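We may consider the following basis for $\mathfrak{su}(2)$:

$$E_1 = \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
E_2 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
E_3 = \frac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \qquad (16.4)$$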


Direct calculation shows that $[E_1, E_2] = E_3$, together with the relations
obtained from this by cyclic permutation of the indices. These are the same
relations as those satisfied by the basis elements $F_j$, $j = 1, 2, 3$, for $\mathfrak{so}(3)$
in (16.2) and (16.3). Thus, there is a Lie algebra isomorphism
$\phi : \mathfrak{su}(2) \to \mathfrak{so}(3)$ such that $\phi(E_j) = F_j$, $j = 1, 2, 3$.

On the other hand, there can be no isomorphism between SU(2) and
SO(3), since SU(2) has a nontrivial center (containing at least $I$ and $-I$),
whereas the center of SO(3) is trivial (Exercise 14). ■
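
The commutation relations used in this proof are easy to confirm by machine. The sketch below takes the $E_j$ from (16.4). The matrices $F_j$ of (16.2) and (16.3) are not reproduced in this passage, so the code assumes the standard generators of rotations about the coordinate axes, chosen so that $e^{aF_1}$ is the rotation matrix displayed in Example 16.34; that choice is an assumption of the sketch, and the helper names are hypothetical.

```python
# Check [A1, A2] = A3 and its cyclic permutations for the su(2) basis (16.4)
# and for an ASSUMED so(3) basis (standard rotation generators).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Basis of su(2) from (16.4).
E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5], [-0.5, 0]]
E3 = [[0, 0.5j], [0.5j, 0]]

# Assumed basis of so(3): generators of rotations about the x, y, z axes.
F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def satisfies_cyclic_relations(A1, A2, A3):
    """True if [A1,A2]=A3, [A2,A3]=A1 and [A3,A1]=A2."""
    return (close(comm(A1, A2), A3)
            and close(comm(A2, A3), A1)
            and close(comm(A3, A1), A2))

su2_ok = satisfies_cyclic_relations(E1, E2, E3)
so3_ok = satisfies_cyclic_relations(F1, F2, F3)
print(su2_ok, so3_ok)
```

Since both bases satisfy the same bracket relations, the linear map sending $E_j$ to $F_j$ preserves brackets, which is the content of the isomorphism claimed above.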


Definition 16.33 Suppose $G$ is a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. A universal cover of $G$ is an ordered pair $(\tilde{G}, \Phi)$ consisting
of a simply connected matrix Lie group $\tilde{G}$ and a Lie group homomorphism
$\Phi : \tilde{G} \to G$ such that the associated Lie algebra homomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$
is an isomorphism of the Lie algebra $\tilde{\mathfrak{g}}$ of $\tilde{G}$ with $\mathfrak{g}$. The map $\Phi$ is called
the covering map for $G$.


Although  each  Lie  group  has  a  universal  cover  that  is  again  a  Lie  group, 
the  universal  cover  of  a  matrix  Lie  group  may  not  be  isomorphic  to  any 
matrix  Lie  group.  [The  universal  cover  of  SL(2;  R),  e.g.,  is  not  a  matrix  Lie 
group.]  It  can  be  shown,  however,  that  if  a  matrix  Lie  group  G  is  compact, 
then  the  universal  cover  of  G  is  again  a  matrix  Lie  group  (not  necessarily 
compact). 

Suppose $\tilde{G}$ is any simply connected Lie group with a Lie algebra $\tilde{\mathfrak{g}}$ that
is isomorphic to $\mathfrak{g}$. The choice of a particular isomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ gives
rise, by Theorem 16.30, to a Lie group homomorphism $\Phi : \tilde{G} \to G$, so that
$(\tilde{G}, \Phi)$ is a universal cover of $G$.

If $(\tilde{G}, \Phi)$ is a universal cover of $G$, it is often convenient to use the
isomorphism $\phi$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$. If we follow this convention, we may
say that a universal cover of $G$ is a simply connected group $\tilde{G}$ having “the
same” Lie algebra as $G$.

If $(\tilde{G}_1, \Phi_1)$ and $(\tilde{G}_2, \Phi_2)$ are two universal covers of a given matrix Lie
group $G$, then there is a unique Lie group isomorphism $\Psi : \tilde{G}_1 \to \tilde{G}_2$ such
that $\Phi_2(\Psi(A)) = \Phi_1(A)$ for all $A \in \tilde{G}_1$. (This result follows easily from
Corollary 16.31.) In light of this uniqueness result, we will often speak of
“the” universal cover of $G$.

Example 16.34 Let $\Phi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ be the unique Lie group
homomorphism for which the associated Lie algebra homomorphism $\phi$ satisfies
$\phi(E_j) = F_j$, $j = 1, 2, 3$. Then $\ker \Phi = \{I, -I\}$ and $(\mathrm{SU}(2), \Phi)$ is a universal
cover of SO(3).

Proof. Since $E_1$ is diagonal, it is easy to see that $e^{2\pi E_1} = -I$ in SU(2).
On the other hand, by a trivial extension of Example 16.16, we have

$$e^{aF_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos a & -\sin a \\ 0 & \sin a & \cos a \end{pmatrix}$$

for all $a \in \mathbb{R}$. In particular, $e^{2\pi F_1} = I$. Thus,

$$\Phi(-I) = \Phi(e^{2\pi E_1}) = e^{2\pi F_1} = I.$$

This shows that $-I$ belongs to the kernel of $\Phi$.

Now, since $\phi$ is injective, $\Phi$ is injective in a neighborhood of $I$. After all,
given distinct elements $A$ and $B$ of SU(2) near $I$, Theorem 16.25 tells us
that we can express $A$ as $e^X$ and $B$ as $e^Y$, with $X$ and $Y$ being distinct
small elements of $\mathfrak{su}(2)$. Then $\phi(X)$ and $\phi(Y)$ are distinct small elements
of $\mathfrak{so}(3)$. Applying Theorem 16.25 again tells us that $\Phi(A) = e^{\phi(X)}$ and
$\Phi(B) = e^{\phi(Y)}$ are distinct.

We see, then, that $\ker \Phi$ is a discrete normal subgroup of SU(2). But a
standard exercise (Exercise 1) shows that a discrete normal subgroup of a
connected group is automatically central. On the other hand, it is easily
verified (Exercise 2) that the center of SU(2) is $\{I, -I\}$, so $\ker \Phi$ cannot be
larger than $\{I, -I\}$.

To show that $\Phi$ maps onto SO(3), we first verify (Exercise 13) that each
element $R$ of SO(3) can be expressed as $R = e^X$, with $X \in \mathfrak{so}(3)$. Since $\phi$
is surjective and $\Phi(e^X) = e^{\phi(X)}$, $\Phi$ maps onto SO(3). ■
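
The two-to-one nature of the covering map can be seen concretely. The following sketch rests on an assumption not stated in the text: that $\Phi$ can be realized by letting $U \in \mathrm{SU}(2)$ act on $\mathfrak{su}(2)$ by conjugation, $X \mapsto UXU^{-1}$, and writing the result in the basis $E_1, E_2, E_3$ of (16.4). With the usual rotation generators as the $F_j$, this realization has differential $E_j \mapsto F_j$. The function names `coords` and `Phi` are hypothetical.

```python
# Sketch of the covering map SU(2) -> SO(3), realized (as an assumption)
# by conjugation on su(2) in the basis E1, E2, E3 of (16.4).
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=60):
    """Matrix exponential by a truncated power series."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

def dagger(U):
    """Conjugate transpose; equals the inverse when U is unitary."""
    n = len(U)
    return [[U[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5], [-0.5, 0]]
E3 = [[0, 0.5j], [0.5j, 0]]

def coords(M):
    """Coefficients (a, b, c) of M = a*E1 + b*E2 + c*E3 for M in su(2)."""
    return [2 * M[0][0].imag, 2 * M[0][1].real, 2 * M[0][1].imag]

def Phi(U):
    """3x3 matrix of X -> U X U^{-1} in the basis E1, E2, E3."""
    cols = [coords(matmul(matmul(U, Ek), dagger(U))) for Ek in (E1, E2, E3)]
    return [[cols[k][j] for k in range(3)] for j in range(3)]

a = 0.7
U = expm(scale(a, E1))
rotation = [[1, 0, 0],
            [0, math.cos(a), -math.sin(a)],
            [0, math.sin(a), math.cos(a)]]
minus_identity = expm(scale(2 * math.pi, E1))   # should be -I in SU(2)
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(close(Phi(U), rotation))
print(close(minus_identity, [[-1, 0], [0, -1]]))
print(close(Phi(minus_identity), identity3))
print(close(Phi(scale(-1, U)), Phi(U)))
```

In this realization $\Phi(U)$ and $\Phi(-U)$ coincide because the two signs cancel under conjugation, which is consistent with $\ker \Phi = \{I, -I\}$.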

16.7 Finite-Dimensional Representations of Lie Groups and Lie Algebras

A representation of a group $G$ is a homomorphism $\Pi$ of $G$ into $\mathrm{GL}(V)$,
the group of invertible linear transformations on some vector space $V$. If $\Pi$
is injective then $G$ is isomorphic to its image under $\Pi$; thus, $\Pi$ serves to
“represent” $G$ concretely as a group of invertible linear transformations.
(We continue to use the term “representation” even if $\Pi$ is not injective.)
Similarly, a representation of a Lie algebra $\mathfrak{g}$ is a Lie algebra homomorphism
of $\mathfrak{g}$ into $\mathfrak{gl}(V)$, the space of all linear transformations of $V$, where we equip
$\mathfrak{gl}(V)$ with the bracket $[X, Y] := XY - YX$.

Recall that an action of a group $G$ on a set $X$ is a map from $G \times X$ to $X$,
denoted $(g, x) \mapsto g \cdot x$, satisfying $e \cdot x = x$ for all $x \in X$ and $g \cdot (h \cdot x) = (gh) \cdot x$
for all $g, h \in G$ and $x \in X$. A representation $\Pi$ of $G$ on some vector space
$V$ gives rise to a linear action of $G$ on $V$, given by $g \cdot v = \Pi(g)v$. (A linear
action is an action for which the map $v \mapsto g \cdot v$ is linear for each $g$.) Thus,
we may use $g \cdot v$ as an alternative notation to $\Pi(g)v$, when convenient.

16.7.1 Finite-Dimensional Representations

If $G$ is a matrix Lie group, then $G$ is already represented as a group of
matrices. Nevertheless, it is of interest [as we will see in Chap. 17 in the
case $G = \mathrm{SO}(3)$] to explore other representations of $G$. Since a matrix Lie
group has a topological structure (inherited from $M_n(\mathbb{C})$), it is natural to
require representations to be continuous. It is also simpler to deal at first
with finite-dimensional representations, that is, those where the vector
space in question is finite dimensional, although eventually we will need to
consider infinite-dimensional representations as well. This discussion leads
to the following definition.

Definition 16.35 Let $G \subset \mathrm{GL}(n; \mathbb{C})$ be a matrix Lie group. A
finite-dimensional representation of $G$ is a continuous homomorphism of $G$
into $\mathrm{GL}(V)$, the group of invertible linear transformations of a
finite-dimensional vector space $V$.

We will assume that all of our vector spaces are over the field $\mathbb{C}$, even
though it is occasionally of interest to consider also representations over $\mathbb{R}$.
The topology on $\mathrm{GL}(V)$ is defined by picking a basis, and thereby identifying
the space of linear maps of $V$ to $V$ with $M_n(\mathbb{C})$. We then use the subset
topology on $\mathrm{GL}(V) \cong \mathrm{GL}(n; \mathbb{C}) \subset M_n(\mathbb{C})$. This topology is easily seen to
be independent of the choice of basis.

An important example of representations in quantum theory arises from
the time-independent Schrödinger equation in $\mathbb{R}^n$, namely the equation
$\hat{H}\psi = E\psi$, for a fixed constant $E \in \mathbb{R}$. If $\hat{H}$ is invariant under rotations,
then the space of solutions to this equation is invariant under rotations.
Note that an individual solution $\psi$ to this equation may or may not be a
rotationally invariant (i.e., radial) function. But if $\hat{H}$ is rotationally
invariant, then rotating a solution to $\hat{H}\psi = E\psi$ will give another solution of this
equation. Even if the quantum Hilbert space is infinite dimensional, the
solution spaces to $\hat{H}\psi = E\psi$ are typically finite dimensional and
constitute finite-dimensional representations of the group $\mathrm{SO}(n)$ of rotations. If
we can understand what all possible finite-dimensional representations of
$\mathrm{SO}(n)$ look like, we will have made a lot of progress in understanding
solutions to $\hat{H}\psi = E\psi$ in the rotationally invariant case. This line of reasoning
will be explored in detail in Chap. 18.

We may consider as well finite-dimensional representations of Lie
algebras. Assuming our Lie algebra $\mathfrak{g}$ is finite dimensional (which is the only
case we will consider in this chapter), there is no need to impose a
requirement of continuity, since a linear map of one finite-dimensional real
or complex vector space to another is automatically continuous.

Definition 16.36 A finite-dimensional representation of a Lie algebra
$\mathfrak{g}$ is a Lie algebra homomorphism of $\mathfrak{g}$ into $\mathfrak{gl}(V)$, the space of all linear
transformations of $V$. Here $\mathfrak{gl}(V)$ is considered as a Lie algebra with bracket
given by $[X, Y] = XY - YX$.

We typically consider Lie algebras defined over the field $\mathbb{R}$, since the Lie
algebra of a matrix Lie group is in general only a real subspace of $M_n(\mathbb{C})$.
Nevertheless, it is convenient to consider vector spaces over $\mathbb{C}$. If $\mathfrak{g}$ is a
real Lie algebra and $V$, and therefore also $\mathfrak{gl}(V)$, is a complex vector space,
then we require only that $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ be real linear, which is the only
requirement that makes sense.

In the interest of simplifying the terminology, we will sometimes speak
of “a representation $V$,” without making explicit mention of the
homomorphism $\Pi$ or $\pi$.

Definition 16.37 If $\Pi : G \to \mathrm{GL}(V)$ is a representation of a matrix Lie
group $G$, then a subspace $W$ of $V$ is called an invariant subspace if
$\Pi(g)w \in W$ for all $g \in G$ and $w \in W$. Similarly, if $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is
a representation of a Lie algebra $\mathfrak{g}$, then a subspace $W$ of $V$ is called an
invariant subspace if $\pi(X)w \in W$ for all $X \in \mathfrak{g}$ and $w \in W$. A
representation of a group or Lie algebra is called irreducible if the only invariant
subspaces are $W = V$ and $W = \{0\}$.

Definition 16.38 If $(\Pi, V_1)$ and $(\Sigma, V_2)$ are representations of a matrix
Lie group $G$, a linear map $\Phi : V_1 \to V_2$ is called an intertwining map (or
morphism) if $\Phi(\Pi(g)v) = \Sigma(g)\Phi(v)$ for all $g \in G$ and $v \in V_1$, with an
analogous definition for intertwining maps of Lie algebra representations. If an
intertwining map is an invertible linear map, it is called an isomorphism.
Two representations are said to be isomorphic (or equivalent) if there
exists an isomorphism between them.

In the “action” notation, the requirement on an intertwining map $\Phi$ is
that $\Phi(g \cdot v) = g \cdot \Phi(v)$, meaning that $\Phi$ commutes with the action of $G$.

A  typical  goal  of  representation  theory  is  to  classify  all  finite-dimensional 
irreducible  representations  of  G  up  to  isomorphism. 

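Given a representation $\Pi : G \to \mathrm{GL}(V)$ of a matrix Lie group $G$, we
can identify $\mathrm{GL}(V)$ with $\mathrm{GL}(N; \mathbb{C})$ and $\mathfrak{gl}(V)$ with $\mathfrak{gl}(N; \mathbb{C})$ by picking a
basis for $V$. We may then apply Theorem 16.23 to obtain a representation
$\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ such that

$$\Pi(e^X) = e^{\pi(X)}$$

for all $X \in \mathfrak{g}$.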

Proposition 16.39 Suppose $G$ is a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. Suppose that $\Pi : G \to \mathrm{GL}(V)$ is a finite-dimensional
representation of $G$ and $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is the associated Lie algebra representation.
Then a subspace $W$ of $V$ is invariant under the action of $G$ if and only if it
is invariant under the action of $\mathfrak{g}$. In particular, $\Pi$ is irreducible if and only
if $\pi$ is irreducible. Furthermore, two representations of $G$ are isomorphic if
and only if the associated Lie algebra representations are isomorphic.


In general, given a representation $\pi$ of $\mathfrak{g}$, there may be no representation
$\Pi$ such that $\pi$ and $\Pi$ are related in the usual way. If, however, $G$ is simply
connected, Theorem 16.30 tells us that there is, in fact, a $\Pi$ associated with
every $\pi$.

Proof. Suppose $W \subset V$ is invariant under $\pi(X)$ for all $X \in \mathfrak{g}$. Then
$W$ is invariant under $\pi(X)^m$ for all $m$. Since $V$ is finite dimensional, any
subspace of it is automatically a closed subset and thus $W$ is invariant
under

$$\Pi(e^X) = e^{\pi(X)} = \sum_{m=0}^{\infty} \frac{\pi(X)^m}{m!}.$$

Since $G$ is connected, every element of $G$ is (Corollary 16.28) a product
of exponentials of elements of $\mathfrak{g}$, and so $W$ is invariant under $\Pi(A)$ for all
$A \in G$.

In the other direction, if $W$ is invariant under $\Pi(A)$ for all $A \in G$, then
since $W$ is closed, it is invariant under

$$\pi(X) = \lim_{h \to 0} \frac{\Pi(e^{hX}) - I}{h}$$

for all $X \in \mathfrak{g}$.

Now suppose $\Pi_1$ and $\Pi_2$ are two representations of $G$, acting on vector
spaces $V_1$ and $V_2$, respectively. If $\Phi : V_1 \to V_2$ is an invertible linear map,
then an argument similar to the above shows $\Phi\Pi_1(A) = \Pi_2(A)\Phi$ for all
$A \in G$ if and only if $\Phi\pi_1(X) = \pi_2(X)\Phi$ for all $X \in \mathfrak{g}$. Thus, $\Phi$ is an
isomorphism of group representations if and only if it is an isomorphism of
Lie algebra representations. ■

Theorem 16.40 (Schur’s Lemma) If $V_1$ and $V_2$ are two irreducible
representations of a group or Lie algebra, then the following hold.

1. If $\Phi : V_1 \to V_2$ is an intertwining map, then either $\Phi = 0$ or $\Phi$ is an
isomorphism.

2. If $\Phi : V_1 \to V_2$ and $\Psi : V_1 \to V_2$ are nonzero intertwining maps, then
there exists a nonzero constant $c \in \mathbb{C}$ such that $\Phi = c\Psi$. In particular,
if $\Phi$ is an intertwining map of $V_1$ to itself then $\Phi = cI$.

Although the first part of Schur’s lemma holds for representations over
an arbitrary field, the second part holds only for representations over
algebraically closed fields.

Proof. It is easy to see that $\ker \Phi$ is an invariant subspace of $V_1$. Since
$V_1$ is irreducible, this means that either $\ker \Phi = V_1$, in which case $\Phi = 0$,
or $\ker \Phi = \{0\}$, in which case $\Phi$ is injective. Similarly, the range of $\Phi$ is
invariant, and thus equal to either $\{0\}$ or $V_2$. If $\Phi$ is not zero, then the
range of $\Phi$ is not zero, hence all of $V_2$. Thus, if $\Phi$ is not zero, it is both
injective and surjective, establishing Point 1.

For Point 2, since $\Phi$ and $\Psi$ are nonzero, they are isomorphisms, by
Point 1. It suffices to prove that $\Gamma := \Phi^{-1}\Psi$ is a multiple of the
identity, where $\Gamma$ is an intertwining map of $V_1$ to itself. Since we are
working over $\mathbb{C}$, $\Gamma$ must have at least one eigenvalue $\lambda$. If $W$ denotes the
$\lambda$-eigenspace of $\Gamma$, then $W$ is invariant under the action of the group or Lie
algebra. After all, if $\Gamma w = \lambda w$, then (in the notation of the group case)
$\Gamma(\Pi(A)w) = \Pi(A)\Gamma w = \lambda \Pi(A)w$. Since $\lambda$ is an eigenvalue of $\Gamma$, the
invariant subspace $W$ is nonzero and thus $W = V_1$, which means precisely
that $\Gamma = \lambda I$. ■

16.7.2  Unitary  Representations 

In  quantum  mechanics,  we  are  interested  not  only  in  vector  spaces,  but, 
more  specifically,  in  Hilbert  spaces,  since  expectation  values  are  defined  in 
terms  of  an  inner  product.  We  wish  to  consider,  then,  actions  of  a  group 
that  preserve  the  inner  product  as  well  as  the  linear  structure.  Although 
the  Hilbert  spaces  in  quantum  mechanics  are  generally  infinite  dimensional, 
we  restrict  our  attention  in  this  section  to  the  finite-dimensional  case. 

Definition 16.41 Suppose $V$ is a finite-dimensional Hilbert space over $\mathbb{C}$.
Denote by $\mathrm{U}(V)$ the group of invertible linear transformations of $V$ that
preserve the inner product. A (finite-dimensional) unitary representation
of a matrix Lie group $G$ is a continuous homomorphism $\Pi : G \to \mathrm{U}(V)$,
for some finite-dimensional Hilbert space $V$.

Proposition 16.42 Let $\Pi : G \to \mathrm{GL}(V)$ be a finite-dimensional
representation of a connected matrix Lie group $G$, and let $\pi$ be the associated
representation of the Lie algebra $\mathfrak{g}$ of $G$. Let $\langle \cdot, \cdot \rangle$ be an inner product on $V$.
Then $\Pi$ is unitary with respect to $\langle \cdot, \cdot \rangle$ if and only if $\pi(X)$ is
skew-self-adjoint with respect to $\langle \cdot, \cdot \rangle$ for all $X \in \mathfrak{g}$, that is, if and only if

$$\pi(X)^* = -\pi(X)$$

for all $X \in \mathfrak{g}$.

In a slight abuse of notation, we will refer to a representation $\pi$ of a
Lie algebra $\mathfrak{g}$ on a finite-dimensional inner product space as unitary if
$\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$.

Proof. Suppose first that $\Pi(A)$ is unitary for all $A \in G$. Then for all $X \in \mathfrak{g}$
and $t \in \mathbb{R}$ we have

$$\Pi(e^{tX})^* = \Pi(e^{tX})^{-1} = \Pi(e^{-tX}) = e^{-t\pi(X)}.$$

On the other hand,

$$\Pi(e^{tX})^* = (e^{t\pi(X)})^* = e^{t\pi(X)^*}.$$

Thus,

$$e^{t\pi(X)^*} = e^{-t\pi(X)}$$

for all $t$. Differentiating at $t = 0$ yields $\pi(X)^* = -\pi(X)$.

In the other direction, if $\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$, then

$$\Pi(e^X)^* = e^{\pi(X)^*} = e^{-\pi(X)} = \Pi(e^{-X}) = \Pi(e^X)^{-1},$$

meaning that $\Pi(e^X)$ is unitary. Since $G$ is connected, Corollary 16.28 tells
us that each element $A$ of $G$ is expressible as a product of exponentials,
from which it follows that $\Pi(A)$ is unitary. ■
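
As a quick numerical illustration of this proposition, one can take the defining representation of SU(2), where $\pi(X) = X$. The sketch below is an illustration under that assumption only; the particular coefficients and helper names are hypothetical. It exponentiates a skew-self-adjoint matrix and a matrix that is not skew-self-adjoint, and measures how far each exponential is from being unitary.

```python
# Exponentials of skew-self-adjoint matrices are unitary; adding a real
# multiple of the identity destroys both properties.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential by a truncated power series."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def is_skew_self_adjoint(A, tol=1e-12):
    Ad = dagger(A)
    n = len(A)
    return all(abs(Ad[i][j] + A[i][j]) <= tol for i in range(n) for j in range(n))

def unitarity_defect(U):
    """Largest entry of U*U - I in absolute value (0 for a unitary U)."""
    P = matmul(dagger(U), U)
    n = len(U)
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# X = 0.3*E1 + 1.1*E2 - 0.8*E3 in the basis (16.4); coefficients are arbitrary.
a, b, c = 0.3, 1.1, -0.8
X = [[0.5j * a, 0.5 * (b + 1j * c)],
     [0.5 * (-b + 1j * c), -0.5j * a]]
X_shifted = [[X[0][0] + 0.1, X[0][1]],
             [X[1][0], X[1][1] + 0.1]]       # X + 0.1*I, no longer skew

defect_skew = unitarity_defect(expm(X))
defect_shifted = unitarity_defect(expm(X_shifted))
print(is_skew_self_adjoint(X), defect_skew)
print(is_skew_self_adjoint(X_shifted), defect_shifted)
```

The first defect should be at the level of rounding error, while the second is visibly nonzero, since $e^{X + 0.1 I} = e^{0.1} e^X$ rescales every vector.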

16.7.3  Projective  Unitary  Representations 

In quantum mechanics, two unit vectors in the quantum Hilbert space that
differ by multiplication by a constant are considered to represent the same
physical state. Thus, an operator of the form $e^{i\theta}I$, with $\theta \in \mathbb{R}$, will act as the
identity at the level of the physical states. Suppose that $V$ is a Hilbert space
over $\mathbb{C}$, assumed for the moment to be finite dimensional. Then it is natural
to consider homomorphisms not into $\mathrm{U}(V)$ but rather into the quotient
group $\mathrm{U}(V)/\{e^{i\theta}I\}$. Of course, given a homomorphism $\Pi$ of $G$ into $\mathrm{U}(V)$,
we can always turn $\Pi$ into a homomorphism of $G$ into the quotient group,
just by composing $\Pi$ with the quotient map. Not every homomorphism into
the quotient group, however, arises from a homomorphism into $\mathrm{U}(V)$.

Definition 16.43 Suppose $V$ is a finite-dimensional Hilbert space over $\mathbb{C}$.
Then the projective unitary group over $V$, denoted $\mathrm{PU}(V)$, is the
quotient group

$$\mathrm{PU}(V) = \mathrm{U}(V)/\{e^{i\theta}I\},$$

where $\{e^{i\theta}I\}$ denotes the group of matrices of the form $e^{i\theta}I$, $\theta \in \mathbb{R}$.

Note that $\{e^{i\theta}I\}$ is a closed normal subgroup of $\mathrm{U}(V)$. Now, $\mathrm{U}(V)$ is
(isomorphic to) a matrix Lie group, since we can identify it with $\mathrm{U}(n)$ by
picking an orthonormal basis for $V$. In general, the quotient of a matrix
Lie group by a closed normal subgroup may not be a matrix Lie group. In
this case, however, it is not hard to realize the quotient $\mathrm{U}(n)/\{e^{i\theta}I\}$ as a
matrix Lie group.


Proposition 16.44 If $V$ is a finite-dimensional Hilbert space over $\mathbb{C}$, then
$\mathrm{PU}(V)$ is isomorphic to a matrix Lie group.

Let $Q : \mathrm{U}(V) \to \mathrm{PU}(V)$ be the quotient homomorphism and let
$q : \mathfrak{u}(V) \to \mathfrak{pu}(V)$ be the associated Lie algebra homomorphism. Then $q$ maps
$\mathfrak{u}(V)$ onto $\mathfrak{pu}(V)$ and the kernel of $q$ is the space of matrices of the form
$iaI$ with $a \in \mathbb{R}$. Thus, $\mathfrak{pu}(V)$ is isomorphic to $\mathfrak{u}(V)/\{iaI\}$.


The Lie algebra $\mathfrak{u}(V)$ of $\mathrm{U}(V)$ is the space of skew-self-adjoint operators
on $V$. In Proposition 16.44, the space $\{iaI\}$ is an ideal in $\mathfrak{u}(V)$ and the
quotient is in the sense of Lie algebras over $\mathbb{R}$; see Exercise 9. If $\dim V = N$,
then it is not hard to see that the Lie algebra $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$ is
isomorphic to the Lie algebra $\mathfrak{su}(N)$. The group $\mathrm{PU}(V)$ is not, however,
isomorphic to the group $\mathrm{SU}(N)$. See Exercise 16.

Proof. If $\dim V = N$, then $\mathfrak{gl}(V)$, the space of all linear maps of $V$ to $V$,
has dimension $N^2$. Given $U \in \mathrm{U}(V)$, we can define
$C_U : \mathfrak{gl}(V) \to \mathfrak{gl}(V)$ by

$$C_U(X) = UXU^{-1}.$$

(That is to say, $C_U$ is conjugation by $U$.) Note that $(C_U)^{-1} = C_{U^{-1}}$ and
$C_{UV} = C_U C_V$. Thus, $C$ (i.e., the map $U \mapsto C_U$) is a homomorphism of
$\mathrm{U}(V)$ into $\mathrm{GL}(\mathfrak{gl}(V))$, and this homomorphism is clearly continuous. If $U$
is a multiple of the identity, then $C_U$ is the identity operator on $\mathfrak{gl}(V)$.
Conversely, if $C_U$ is the identity, then $UX = XU$ for all $X \in \mathfrak{gl}(V)$, which
implies (Exercise 18) that $U$ is a multiple of the identity. Thus, the kernel
of $C$ consists precisely of those scalar multiples of the identity that are in
$\mathrm{U}(V)$; that is, $\ker C = \{e^{i\theta}I\}$.

We have constructed, then, a homomorphism of $\mathrm{U}(V)$ into
$\mathrm{GL}(\mathfrak{gl}(V)) \cong \mathrm{GL}(N^2; \mathbb{C})$ with a kernel that is precisely $\{e^{i\theta}I\}$. The image of
$\mathrm{U}(V)$ under this homomorphism is, therefore, isomorphic to the quotient group
$\mathrm{U}(V)/\{e^{i\theta}I\}$. Furthermore, since $\mathrm{U}(V)$ is compact, the image of $\mathrm{U}(V)$
under $C$ is compact and thus closed. This image is, then, a matrix Lie group
isomorphic to $\mathrm{PU}(V)$.

Let $c$ be the Lie algebra homomorphism associated with the
homomorphism $C$. Using Point 3 of Theorem 16.23, we may calculate that

$$c_X(Y) = \frac{d}{dt}\, e^{tX} Y e^{-tX} \Big|_{t=0} = XY - YX = [X, Y].$$

Using Exercise 18 again, we see that $c_X = 0$ if and only if $X$ is a multiple
of the identity. Thus, the kernel of $c$ consists of all the scalar multiples of
$I$ in $\mathfrak{u}(V)$, namely $\{iaI\}$.

Now, the image of $\mathrm{U}(V)$ under $C$ is (isomorphic to) $\mathrm{PU}(V)$; in particular,
$C$ maps $\mathrm{U}(V)$ onto $\mathrm{PU}(V)$. It follows that $c$ must map $\mathfrak{u}(V)$ onto $\mathfrak{pu}(V)$.
(This claim follows from Theorem 3.15 in [21].) Thus, $\mathfrak{pu}(V) \cong \mathfrak{u}(V)/\{iaI\}$. ■
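
Two facts from this proof can be checked numerically in the case $\dim V = 2$: that conjugation by $U$ is unchanged when $U$ is multiplied by a phase $e^{i\theta}$, and that the derivative of $t \mapsto e^{tX} Y e^{-tX}$ at $t = 0$ is the commutator $[X, Y]$. The sketch below uses arbitrarily chosen test matrices and a central finite difference; the names `conj_by`, `theta`, and `h` are hypothetical and the tolerances are illustrative.

```python
# Conjugation C_U(X) = U X U^{-1}: phase invariance and its derivative.
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, terms=40):
    """Matrix exponential by a truncated power series."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

def max_abs(A):
    return max(abs(a) for row in A for a in row)

def conj_by(U, U_inv, Y):
    """C_U(Y) = U Y U^{-1}, with the inverse supplied explicitly."""
    return matmul(matmul(U, Y), U_inv)

X = [[0.4j, 0.3 + 0.2j], [-0.3 + 0.2j, -0.1j]]   # skew-self-adjoint test matrix
Y = [[1.0, 2.0 - 1.0j], [0.5j, -3.0]]            # arbitrary element of gl(V)

# Phase invariance: C_{e^{i theta} U} = C_U.
theta = 0.9
U, U_inv = expm(X), expm(scale(-1, X))
phase = cmath.exp(1j * theta)
phase_gap = max_abs(sub(conj_by(scale(phase, U), scale(1 / phase, U_inv), Y),
                        conj_by(U, U_inv, Y)))

# Central difference approximation to d/dt e^{tX} Y e^{-tX} at t = 0.
h = 1e-4
plus = conj_by(expm(scale(h, X)), expm(scale(-h, X)), Y)
minus = conj_by(expm(scale(-h, X)), expm(scale(h, X)), Y)
derivative = scale(1 / (2 * h), sub(plus, minus))
commutator = sub(matmul(X, Y), matmul(Y, X))
derivative_gap = max_abs(sub(derivative, commutator))

print(phase_gap, derivative_gap)
```

Both gaps should be small, the first at rounding level and the second limited by the $O(h^2)$ error of the finite difference.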


Definition 16.45 A finite-dimensional projective unitary
representation of a matrix Lie group $G$ is a continuous homomorphism $\Pi$ of $G$ into
$\mathrm{PU}(V)$, where $V$ is a finite-dimensional Hilbert space over $\mathbb{C}$. A subspace
$W$ of $V$ is said to be invariant under $\Pi$ if for each $A \in G$, $W$ is invariant
under $U$ for every $U \in \mathrm{U}(V)$ such that $[U] = \Pi(A)$. A projective unitary
representation $(\Pi, V)$ is irreducible if the only invariant subspaces are $\{0\}$
and $V$.

Given an ordinary unitary representation, $\Sigma : G \to \mathrm{U}(V)$, we can always
form a projective representation, $\Pi : G \to \mathrm{PU}(V)$, simply by setting
$\Pi = Q \circ \Sigma$. Not every projective representation, however, arises in this fashion.
Thus, considering projective representations gives us more flexibility than
considering ordinary unitary representations.

Proposition 16.46 Let $\Pi : G \to \mathrm{PU}(V)$ be a finite-dimensional projective
unitary representation of a matrix Lie group $G$, and let $\pi : \mathfrak{g} \to \mathfrak{pu}(V)$ be
the associated Lie algebra homomorphism. Then there exists a Lie algebra
homomorphism $\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $\pi(X) = q(\sigma(X))$ for all $X \in \mathfrak{g}$.
It is possible to choose $\sigma$ so that $\mathrm{trace}(\sigma(X)) = 0$ for all $X \in \mathfrak{g}$, and $\sigma$ is
unique if we require this condition.

That is to say, every finite-dimensional projective representation can be
“de-projectivized” at the Lie algebra level. In general, $\sigma$ is not unique,
because there may be $\sigma$’s for which $\mathrm{trace}(\sigma(X))$ is nonzero for some $X$.
On the other hand, if $\mathfrak{g}$ has the property that every $X \in \mathfrak{g}$ is a linear
combination of commutators (which is true if $\mathfrak{g} = \mathfrak{so}(3)$), then $\sigma$ is unique.
See Exercise 15.

Proof. Recall that $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$. That is, for each $X \in \mathfrak{g}$, $\pi(X)$
denotes a whole family of operators that differ by adding $iaI$. If $Y \in \mathfrak{u}(V)$
is any representative of $\pi(X)$, then since $Y^* = -Y$, the trace of $Y$ will
be pure imaginary. Thus, there is a unique pure-imaginary constant
$c = -\mathrm{trace}(Y)/\dim V$ such that the trace of $Y + cI$ is zero. Let us then set
$\sigma(X) = Y + cI$. Since $\pi$ is a Lie algebra homomorphism, $\sigma([X, Y])$ will
equal $[\sigma(X), \sigma(Y)] + iaI$, for some $a \in \mathbb{R}$. Since $\mathrm{trace}(\sigma([X, Y])) = 0$ by
construction and since the commutator of any two matrices has trace zero,
we see that actually $a = 0$. Thus, a $\sigma$ as in the proposition exists, and it is
unique if we require that $\sigma(X)$ have trace zero. ■
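
The normalization step in this proof can be illustrated numerically. The following is only a minimal sketch in plain Python: the matrix `Y` is made-up sample data standing in for a representative of $\pi(X)$, and the helper names (`trace`, `trace_zero_representative`, `is_skew_hermitian`) are hypothetical, not taken from the text.

```python
def trace(m):
    """Sum of the diagonal entries of a square matrix given as nested lists."""
    return sum(m[i][i] for i in range(len(m)))


def trace_zero_representative(y):
    """Return Y + cI with c = -trace(Y)/dim V, as in the proof."""
    n = len(y)
    c = -trace(y) / n
    return [[y[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]


def is_skew_hermitian(m, tol=1e-12):
    """Check that M* = -M entry by entry."""
    n = len(m)
    return all(abs(m[i][j] + m[j][i].conjugate()) < tol
               for i in range(n) for j in range(n))


# Hypothetical skew-Hermitian representative Y of some class in pu(V), dim V = 2.
Y = [[2j, 1 + 1j],
     [-1 + 1j, 0.5j]]

sigma = trace_zero_representative(Y)

# trace(Y) is purely imaginary, so c is purely imaginary and Y + cI stays in u(V).
print(trace(Y), trace(sigma), is_skew_hermitian(sigma))
```

Because `sigma` differs from `Y` only by an imaginary multiple of the identity, both matrices represent the same element of $\mathfrak{u}(V)/\{iaI\}$.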

Theorem 16.47 Suppose $G$ is a matrix Lie group and $\tilde{G}$ is a universal
cover of $G$, with covering map $\Phi$. Then the following hold.

1. Let $\Pi : G \to \mathrm{PU}(V)$ be a finite-dimensional projective unitary
representation of $G$. Then there is an ordinary unitary representation
$\Sigma : \tilde{G} \to \mathrm{U}(V)$ of $\tilde{G}$ such that $\Pi \circ \Phi = Q \circ \Sigma$. Any such $\Sigma$ is
irreducible if and only if $\Pi$ is irreducible. It is possible to choose $\Sigma$ so
that $\det(\Sigma(A)) = 1$ for all $A \in \tilde{G}$, and $\Sigma$ is unique if we require this
condition.

2. Let $\Sigma$ be a finite-dimensional irreducible unitary representation of $\tilde{G}$.
Then the kernel of the associated projective unitary representation
$Q \circ \Sigma$ contains the kernel of the covering map $\Phi$. Thus, $Q \circ \Sigma$ factors
through $G$ and gives rise to a projective unitary representation of $G$.

In the finite-dimensional case, then, there is a one-to-one correspondence
between irreducible projective unitary representations of $G$ and irreducible,
determinant-one ordinary unitary representations of $\tilde{G}$. Point 1 of the
theorem means that any finite-dimensional projective unitary representation
of the group $G$ can be "de-projectivized" at the expense of passing to the
universal cover $\tilde{G}$ of $G$.

Note that Theorem 16.47 applies only to finite-dimensional projective
unitary representations. Example 16.56 will provide an infinite-dimensional
example in which Point 1 of the theorem fails.

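Proof. If $\mathfrak{g}$ is the Lie algebra of $G$, Proposition 16.46 tells us that we can
find an ordinary representation $\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $q \circ \sigma = \pi$. We then
define a representation $\tilde{\sigma} : \tilde{\mathfrak{g}} \to \mathfrak{u}(V)$ of the Lie algebra $\tilde{\mathfrak{g}}$ of $\tilde{G}$ by setting
$\tilde{\sigma}(X) = \sigma(\phi(X))$, $X \in \tilde{\mathfrak{g}}$, where $\phi$ is the Lie algebra homomorphism
associated with $\Phi$. Since $\tilde{G}$ is simply connected, we can then find
a unique representation $\Sigma : \tilde{G} \to \mathrm{U}(V)$ such that $\Sigma(e^X) = e^{\tilde{\sigma}(X)}$ for all
$X \in \tilde{\mathfrak{g}}$. Since

$$q \circ \tilde{\sigma} = q \circ \sigma \circ \phi = \pi \circ \phi,$$

it follows that $Q \circ \Sigma = \Pi \circ \Phi$. Furthermore, if $\Sigma$ maps into $\mathrm{SU}(V)$, then
$\sigma = \tilde{\sigma} \circ \phi^{-1}$ maps into $\mathfrak{su}(V)$. This condition uniquely determines $\sigma$ and thus
also $\tilde{\sigma}$ and $\Sigma$, establishing Point 1 of the theorem.

For Point 2, observe that $\ker \Phi$ is a discrete normal subgroup of $\tilde{G}$, which
is therefore central (Exercises 1 and 12). Thus, for all $A \in \ker \Phi$, we have

$$\Sigma(A)\Sigma(B) = \Sigma(AB) = \Sigma(BA) = \Sigma(B)\Sigma(A)$$

for all $B \in \tilde{G}$. That is to say, $\Sigma(A)$ is an intertwining map of $V$ to itself.
Since $V$ is irreducible as a representation of $\tilde{G}$, Schur's lemma tells us
that $\Sigma(A) = cI$, where $|c| = 1$ because $\Sigma(A) \in \mathrm{U}(V)$. Thus, $A$ is in the
kernel of the associated projective representation $Q \circ \Sigma$. ■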


16.8  New  Representations  from  Old 


In this section, we consider three basic mechanisms for combining
representations to produce new representations: direct sums, tensor products,
and duals. This section assumes familiarity with these notions at the level
of vector spaces; a brief review is provided in Appendix A.1.

Definition 16.48 Suppose $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations of a
matrix Lie group $G$. The direct sum of these two representations is the
representation $\Pi_1 \oplus \Pi_2 : G \to \mathrm{GL}(V_1 \oplus V_2)$ given by

$$(\Pi_1 \oplus \Pi_2)(A) = \Pi_1(A) \oplus \Pi_2(A).$$

The tensor product of $\Pi_1$ and $\Pi_2$ is the representation
$\Pi_1 \otimes \Pi_2 : G \to \mathrm{GL}(V_1 \otimes V_2)$ given by

$$(\Pi_1 \otimes \Pi_2)(A) = \Pi_1(A) \otimes \Pi_2(A).$$

Finally, the dual of $\Pi_1$ is the representation $\Pi_1^{tr} : G \to \mathrm{GL}(V_1^*)$ given by

$$\Pi_1^{tr}(A) = \Pi_1(A^{-1})^{tr} = (\Pi_1(A)^{tr})^{-1}.$$

Similarly, the direct sum, tensor product, and dual of Lie algebra
representations can be defined by

$$(\pi_1 \oplus \pi_2)(X) = \pi_1(X) \oplus \pi_2(X)$$
$$(\pi_1 \otimes \pi_2)(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X)$$
$$\pi_1^{tr}(X) = -\pi_1(X)^{tr}.$$


It is important to note the differences in formulas between the group and
the Lie algebra in the case of tensor products and dual representations. It
is easy to motivate the definitions for the Lie algebra: If $G$ acts on $V_1 \otimes V_2$
by $\Pi_1(A) \otimes \Pi_2(A)$, then the associated Lie algebra action will be given by

$$\frac{d}{dt}\, \Pi_1(e^{tX}) \otimes \Pi_2(e^{tX}) \Big|_{t=0} = \pi_1(X) \otimes I + I \otimes \pi_2(X).$$

Of course, we continue to use this last formula for tensor products of Lie
algebra representations, even if there is no associated group representation.

Remark 16.49 If $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations of a group $G$,
it is possible to view $V_1 \otimes V_2$ as a representation of the direct product group
$G \times G$, by setting

$$(\Pi_1 \otimes \Pi_2)(A, B) = \Pi_1(A) \otimes \Pi_2(B).$$

Similarly, if $(\pi_1, V_1)$ and $(\pi_2, V_2)$ are representations of a Lie algebra $\mathfrak{g}$, it
is possible to view $V_1 \otimes V_2$ as a representation of $\mathfrak{g} \oplus \mathfrak{g}$ by setting

$$(\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y).$$

Nevertheless, it is, in most cases, more natural to view $V_1 \otimes V_2$ as a
representation of $G$ itself, rather than of $G \times G$. Even if $V_1$ and $V_2$ are
irreducible representations of $G$, the space $V_1 \otimes V_2$ will in most cases fail
to be irreducible as a representation of $G$. If, for example, we take
$V_1 = V_2 = V$, then the space of symmetric tensors inside $V \otimes V$ will form a
nontrivial invariant subspace, unless $\dim V = 1$. An important problem in
representation theory is to decompose $V_1 \otimes V_2$ as a direct sum of irreducible
representations, where $V_1$ and $V_2$ are irreducible representations of a fixed
group or Lie algebra. In the case of the Lie algebra $\mathfrak{su}(2)$, this decomposition
is discussed in Sect. 17.9.
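
The relation between the group-level and Lie-algebra-level tensor product formulas of Definition 16.48 can be checked numerically. The sketch below uses only the Python standard library; the matrices `P` and `Q` are arbitrary sample data standing in for $\pi_1(X)$ and $\pi_2(X)$, and all helper names (`kron`, `expm`, and so on) are hypothetical. Since $\pi_1(X) \otimes I$ and $I \otimes \pi_2(X)$ commute, one expects $e^{t(\pi_1(X) \otimes I + I \otimes \pi_2(X))} = e^{t\pi_1(X)} \otimes e^{t\pi_2(X)}$.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the product basis."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def expm(a, terms=40):
    """Matrix exponential by a truncated power series (fine for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for m in range(1, terms):
        term = scale(1.0 / m, matmul(term, a))
        result = add(result, term)
    return result


def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Sample matrices playing the roles of pi_1(X) and pi_2(X) (illustrative only).
P = [[0.0, 1.0], [-1.0, 0.0]]
Q = [[0.3, -0.2], [0.5, 0.1]]
I2 = identity(2)

# Lie algebra formula: pi_1(X) (x) I + I (x) pi_2(X).
generator = add(kron(P, I2), kron(I2, Q))

# Exponentiating the Lie algebra formula reproduces the group formula.
t = 0.7
group_side = kron(expm(scale(t, P)), expm(scale(t, Q)))
algebra_side = expm(scale(t, generator))
exact_gap = max_diff(group_side, algebra_side)

# Difference quotient at t = 0 approximates the generator.
h = 1e-6
quotient = scale(1.0 / h, add(kron(expm(scale(h, P)), expm(scale(h, Q))),
                              scale(-1.0, identity(4))))
derivative_gap = max_diff(quotient, generator)
print(exact_gap, derivative_gap)
```

The naive formula $\pi_1(X) \otimes \pi_2(X)$ would not pass either check, which is the difference between group and Lie algebra formulas that the text emphasizes.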

Definition 16.50 A finite-dimensional representation of a group or Lie
algebra is said to be completely reducible if it is isomorphic to a direct
sum of irreducible representations.

Proposition 16.51 Every finite-dimensional unitary representation of a
group or Lie algebra is completely reducible.

Proof. Suppose $(\Pi, V)$ is a unitary representation of a matrix Lie group $G$.
If $W$ is a subspace of $V$ invariant under each $\Pi(A)$, then $W^\perp$ is invariant
under each $\Pi(A)^*$, as the reader may easily verify. But since $\Pi$ is unitary,

$$\Pi(A)^* = \Pi(A)^{-1} = \Pi(A^{-1}).$$

Thus, $W^\perp$ is invariant under $\Pi(A^{-1})$ for all $A \in G$, hence under $\Pi(A)$ for all
$A \in G$. We conclude that, in the unitary case, the orthogonal complement
of an invariant subspace is always invariant.

If $V$ is irreducible, there is nothing to prove. If not, we pick a nontrivial
invariant subspace $W$ and decompose $V$ as $W \oplus W^\perp$. The restriction of $\Pi$
to $W$ or to $W^\perp$ is again a unitary representation, so we can repeat this
procedure for each of these subspaces. Since $V$ is finite dimensional, the
process must eventually terminate, yielding an orthogonal decomposition
of $V$ as a direct sum of irreducible invariant subspaces.

If we consider a unitary representation $\pi$ of a Lie algebra $\mathfrak{g}$, we have
the same argument, but with the identity $\Pi(A)^* = \Pi(A^{-1})$ replaced by
$\pi(X)^* = -\pi(X)$. ■

Proposition 16.52 Suppose $K$ is a compact matrix Lie group. For any
finite-dimensional representation $(\Pi, V)$ of $K$, there exists an inner product
on $V$ such that $\Pi(A)$ is unitary for all $A \in K$. In particular, every
finite-dimensional representation of $K$ is completely reducible.

See Proposition 4.36 in [21].


16.9  Infinite-Dimensional  Unitary  Representations 

For the applications we have in mind, we need to consider representations
that are infinite dimensional. The theory of such representations is
inevitably more complicated than that of finite-dimensional representations.
For our purposes, it suffices to consider the nicest sort of
infinite-dimensional representations: unitary representations in a Hilbert space.


16.9.1  Ordinary  Unitary  Representations 

We begin by considering ordinary representations and then turn to
projective representations.


Definition 16.53 Suppose $G$ is a matrix Lie group. Then a unitary
representation of $G$ is a strongly continuous homomorphism $\Pi : G \to \mathrm{U}(H)$,
where $H$ is a separable Hilbert space and $\mathrm{U}(H)$ is the group of unitary
operators on $H$. Here, strong continuity of $\Pi$ means that if a sequence $A_m$ in
$G$ converges to $A \in G$, then

$$\lim_{m \to \infty} \|\Pi(A_m)\psi - \Pi(A)\psi\| = 0$$

for all $\psi \in H$.


We can attempt to associate to a unitary representation $\Pi$ of $G$ some
sort of representation $\pi$ of the Lie algebra $\mathfrak{g}$ of $G$, by imitating the
construction in Theorem 16.23. For any $X \in \mathfrak{g}$, the map $t \mapsto \Pi(e^{tX})$ is a
strongly continuous one-parameter unitary group. Thus, Stone's theorem
(Theorem 10.15) tells us that there exists a unique self-adjoint operator $A$
such that $\Pi(e^{tX}) = e^{itA}$ for all $t \in \mathbb{R}$. If we let $\pi(X)$ denote the
skew-self-adjoint operator $iA$, we will have

$$\Pi(e^{tX}) = e^{t\pi(X)}. \tag{16.5}$$


The operators $\pi(X)$, $X \in \mathfrak{g}$, are in general unbounded and defined only
on a dense subspace of $H$. Nevertheless, it can be shown (see, e.g., [43])
that there exists a dense subspace $V$ of $H$ that is contained in the domain of
each $\pi(X)$, that is invariant under each $\pi(X)$, and on which we have
$\pi([X, Y]) = [\pi(X), \pi(Y)]$. In the case of the particular representation that
we will consider in the next chapter, we can avoid these difficulties by
looking at finite-dimensional invariant subspaces.

Proposition 16.54 Suppose $G$ is a matrix Lie group and $\Pi : G \to \mathrm{U}(H)$ is
a unitary representation of $G$. For each $X \in \mathfrak{g}$, let $\pi(X)$ denote the operator
in (16.5). Suppose $V \subset H$ is a finite-dimensional subspace of $H$ such that
$\Pi(A)$ maps $V$ into $V$, for all $A \in G$. Then for all $X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$,
$\pi(X)$ maps $V$ into $V$, and we have

$$\pi([X, Y])v = [\pi(X), \pi(Y)]v \tag{16.6}$$

for all $v \in V$.

In the other direction, suppose $G$ is connected and suppose $V$ is any
finite-dimensional subspace of $H$ such that for all $X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$
and $\pi(X)$ maps $V$ into $V$. Then $\Pi(A)$ also maps $V$ into $V$, for all $A \in G$.

Proof. Since $V$ is invariant under both $\Pi(A)$ and $\Pi(A)^* = \Pi(A^{-1})$, the
restriction to $V$ of each $\Pi(A)$ is unitary. The operators $\Pi(A)|_V$ form a
finite-dimensional unitary representation of $G$ that is strongly continuous
and thus continuous. (In the finite-dimensional case, all reasonable notions
of continuity for representations coincide.) For each $X \in \mathfrak{g}$, Theorem 16.18
tells us that there is an operator $\tilde{X}$ on $V$ such that

$$\Pi(e^{tX})|_V = e^{t\tilde{X}}.$$

Thus, for any $v \in V$, we have

$$\lim_{t \to 0} \frac{\Pi(e^{tX})v - v}{t} = \lim_{t \to 0} \frac{e^{t\tilde{X}}v - v}{t} = \tilde{X}v.$$

This calculation shows that $v$ is in the domain of the infinitesimal
generator $\pi(X)$ of the unitary group $\Pi(e^{tX})$, and that $\pi(X)v = \tilde{X}v$. Since the
operators $\tilde{X}$, $X \in \mathfrak{g}$, form a representation of $\mathfrak{g}$, we have the relation (16.6).

In the other direction, if $V$ is invariant under $\pi(X)$, the restriction of
$\pi(X)$ to $V$ is automatically bounded. Thus, there is a constant $C$ such that

$$\|\pi(X)^m v\| \le C^m \|v\| \tag{16.7}$$

for all $v \in V$. If we use the direct-integral form of the spectral theorem
for the self-adjoint operator $A := -i\pi(X)$, it is easy to see that (16.7) can
only hold if $v$, viewed as an element of the direct integral, is supported on
a bounded interval inside the spectrum of $A$. Since the power series of the
function $\lambda \mapsto e^{it\lambda}$ converges to $e^{it\lambda}$ uniformly on any finite interval, we will
have

$$\Pi(e^{tX})v = \sum_{m=0}^{\infty} \frac{t^m \pi(X)^m}{m!}\, v.$$

Each term in the above power series belongs to $V$, which is finite
dimensional and thus closed. We conclude that $\Pi(e^{tX})v$ belongs to $V$ for all
$X \in \mathfrak{g}$. Since $G$ is connected, each element of $G$ is a product of
exponentials of Lie algebra elements, and we have the claim. ■

16.9.2 Projective Unitary Representations

Given a Hilbert space $H$, let $S_H$ denote the unit sphere in $H$, that is, the
set of vectors with norm 1. Let $PH$ be the quotient space $(S_H)/\!\sim$, where $\sim$
denotes the equivalence relation in which $u \sim v$ if and only if $u = e^{i\theta}v$
for some $\theta \in \mathbb{R}$. The quotient map $q : S_H \to PH$ induces a topology on
$PH$ in which a set $U \subset PH$ is open if and only if $q^{-1}(U)$ is open as a
subset of the metric space $S_H \subset H$.

As in the finite-dimensional case, we can form the quotient group

$$\mathrm{PU}(H) := \mathrm{U}(H)/\{e^{i\theta}I\}.$$

The action of $\mathrm{U}(H)$ on $S_H$ descends to a well-defined action of $\mathrm{PU}(H)$
on $PH$.


Definition 16.55 A projective unitary representation of a matrix Lie
group $G$ is a homomorphism $\Pi : G \to \mathrm{PU}(H)$, for some Hilbert space $H$,
with the property that if a sequence $A_m$ in $G$ converges to $A$ in $G$, then

$$\Pi(A_m)x \to \Pi(A)x$$

for all $x \in PH$.

Recall that in the finite-dimensional case, every projective unitary
representation of $G$ can be "de-projectivized" at the expense of possibly having
to pass to the universal cover $\tilde{G}$ of $G$ (Theorem 16.47). The
de-projectivization proceeds by passing to the Lie algebra, choosing the
trace-zero representative of each equivalence class, and then exponentiating
back to the universal cover of the original group. This approach does
not work in the infinite-dimensional case. After all, even assuming we can
construct a Lie algebra homomorphism $\pi(X)$ for each $X \in \mathfrak{g}$, the
representatives of $\pi(X)$ are typically unbounded operators on $H$, for which the
notion of trace does not make sense. This difficulty is not just a technicality;
the corresponding result in the infinite-dimensional case is false, as we
will now see.


Example 16.56 For all $(a, b) \in \mathbb{R}^2$, define an operator $T_{(a,b)}$ on $L^2(\mathbb{R})$ by

$$(T_{(a,b)}\psi)(x) = e^{iax}\psi(x - b).$$

Then $T_{(a,b)}$ is unitary for all $(a, b) \in \mathbb{R}^2$ and we have

$$(T_{(a,b)}T_{(a',b')}\psi)(x) = e^{iax}e^{ia'(x-b)}\psi(x - (b + b'))$$
$$= e^{-ia'b}\,(T_{(a+a',b+b')}\psi)(x). \tag{16.8}$$

The map $(a, b) \mapsto [T_{(a,b)}]$ is a homomorphism of $\mathbb{R}^2$ into $\mathrm{PU}(L^2(\mathbb{R}))$, and
this homomorphism is continuous in the sense of Definition 16.55. There
does not, however, exist any homomorphism $S : \mathbb{R}^2 \to \mathrm{U}(L^2(\mathbb{R}))$ such that
$[S_{(a,b)}] = [T_{(a,b)}]$ for all $(a, b) \in \mathbb{R}^2$.

Thus, even though $\mathbb{R}^2$ is simply connected (and thus its own universal
cover), there is no way to de-projectivize the projective unitary
representation $(a, b) \mapsto [T_{(a,b)}]$ of $\mathbb{R}^2$.

Proof. The map $(a, b) \mapsto T_{(a,b)}$ is easily seen to be strongly continuous,
and thus the map $(a, b) \mapsto [T_{(a,b)}]$ is continuous in the sense of
Definition 16.55. If a homomorphism $S$ with the indicated properties existed,
then there would be constants $\theta_{a,b}$ such that $S_{(a,b)} = e^{i\theta_{a,b}}T_{(a,b)}$. But then
since $S$ is a homomorphism from the commutative group $\mathbb{R}^2$ into $\mathrm{U}(L^2(\mathbb{R}))$,
the operator $S_{(a,b)}$ would have to commute with $S_{(a',b')}$ for all $(a, b)$ and
$(a', b')$. But then the operators $T_{(a,b)}$ and $T_{(a',b')}$, being constant multiples
of commuting operators, would need to commute as well. But this is not the
case; for example, $T_{(a,0)}$ does not commute with $T_{(0,b')}$ (unless $ab'$ is an
integer multiple of $2\pi$), as is easily verified using (16.8). ■
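
The phase factor in (16.8), and the resulting failure of commutativity, can be observed by applying the operators to a concrete function and evaluating at a few points. This is only an illustrative sketch in plain Python: the test function `psi`, the parameter values, and the sample points are arbitrary choices, and `T` is a hypothetical helper that models $T_{(a,b)}$ as a map from functions to functions.

```python
import cmath
import math


def T(a, b):
    """Model T_(a,b): psi -> (x -> e^{iax} psi(x - b))."""
    def apply(psi):
        return lambda x: cmath.exp(1j * a * x) * psi(x - b)
    return apply


def psi(x):
    # An arbitrary rapidly decaying test function (illustrative choice).
    return math.exp(-x * x) * (1 + x)


a, b, a2, b2 = 0.7, -0.4, 1.3, 0.9
points = [-1.5, 0.0, 0.3, 2.0]

# Left side of (16.8): T_(a,b) T_(a',b') psi.
lhs = T(a, b)(T(a2, b2)(psi))
# Right side of (16.8): e^{-i a' b} T_(a+a', b+b') psi.
rhs = T(a + a2, b + b2)(psi)
phase = cmath.exp(-1j * a2 * b)
cocycle_error = max(abs(lhs(x) - phase * rhs(x)) for x in points)

# T_(a,0) and T_(0,b') fail to commute: the two orders differ by e^{-i a b'}.
ab = T(a, 0.0)(T(0.0, b2)(psi))
ba = T(0.0, b2)(T(a, 0.0)(psi))
commutator_gap = max(abs(ab(x) - ba(x)) for x in points)
phase_error = max(abs(ba(x) - cmath.exp(-1j * a * b2) * ab(x)) for x in points)
print(cocycle_error, commutator_gap, phase_error)
```

The two orderings agree only up to the scalar $e^{-iab'}$, so they define the same element of $\mathrm{PU}(L^2(\mathbb{R}))$ even though the operators themselves differ.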

Despite the negative result in Example 16.56, there is a positive result in
this direction: If $G$ is connected and "semi-simple," every projective unitary
representation of $G$ can be de-projectivized after passing to the universal
cover. Here, a Lie algebra $\mathfrak{g}$ is said to be simple if $\mathfrak{g}$ has no nontrivial ideals
and $\dim \mathfrak{g} \ge 2$. A Lie algebra is said to be semi-simple if it is a direct sum
of simple algebras. Finally, a Lie group $G$ is said to be semi-simple if the
Lie algebra $\mathfrak{g}$ of $G$ is semi-simple.

For any connected Lie group $G$, a projective unitary representation $\Pi$ of
$G$ can be de-projectivized by passing to a one-dimensional central
extension. A one-dimensional central extension of $G$ is a Lie group $G'$ together
with a surjective homomorphism $\Phi : G' \to G$ such that the kernel of $\Phi$ is
one-dimensional and contained in the center of $G'$. See the article [1] of V.
Bargmann for more information about these issues.


16.10  Exercises 


1. Suppose that $G$ is a connected matrix Lie group and that $N$ is a
discrete normal subgroup of $G$, meaning that there is some neighborhood
$U$ of $I$ in $G$ such that $U \cap N = \{I\}$. Show that $N$ is contained
in the center of $G$.

Hint: Consider the quantity $gng^{-1}$ for $g \in G$ and $n \in N$.


2.  (a)  Suppose  two  elements  U  and  V  of  SU(2)  commute.  Show  that 

each  eigenspace  for  U  is  invariant  under  V  and  vice  versa. 

(b)  Show  that  if  U  is  in  the  center  of  SU(2),  then  U  =  I  or  U  =  —I. 


3. Define the Hilbert-Schmidt norm of a matrix $X \in M_n(\mathbb{C})$ by the
formula

$$\|X\|_{HS}^2 = \sum_{j,k=1}^{n} |X_{jk}|^2.$$

Using the Cauchy-Schwarz inequality, show that

$$\|XY\|_{HS} \le \|X\|_{HS}\, \|Y\|_{HS} \tag{16.9}$$

for all $X, Y \in M_n(\mathbb{C})$.

4. Using term-by-term differentiation of power series, show that for all
$X \in M_n(\mathbb{C})$ and all $1 \le j, k \le n$, we have

$$\frac{d}{dt}\left(e^{tX}\right)_{jk} = \left(Xe^{tX}\right)_{jk}.$$


5.  Verify  Property  4  of  Theorem  16.15.  This  should  be  easy  in  the  case 
that  X  is  diagonalizable.  In  the  general  case,  either  use  the  Jordan 
canonical  form  or  appeal  to  the  fact  that  diagonalizable  matrices  are 
dense  in  Mn(C). 

6. Suppose $X$ and $Y$ are commuting $n \times n$ matrices. Show that

$$e^X e^Y = e^{X+Y}.$$

This is Property 5 of Theorem 16.15.

Hint: Multiply together the power series for $e^X$ and $e^Y$ and then
group terms where the total power of $X$ and $Y$ is $n$.


7. For $A \in M_n(\mathbb{C})$, define the logarithm of $A$ by the power series

$$\log A = (A - I) - \frac{(A - I)^2}{2} + \frac{(A - I)^3}{3} - \cdots$$

whenever this series converges. Assume the following result: If $A$ is
sufficiently close to $I$, then $\log A$ is defined and $\exp(\log A) = A$.
[This can be seen easily when $A$ is diagonalizable, and the set of
diagonalizable matrices is dense in $M_n(\mathbb{C})$.]

(a) Show that there exists a constant $C$ such that for all $A$ with
$\|A - I\| < 1/2$ we have

$$\|\log A - (A - I)\| \le C\|A - I\|^2.$$

(b) Show that for all $X, Y \in M_n(\mathbb{C})$ we have

$$\log\left(e^{X/m}e^{Y/m}\right) = \frac{X}{m} + \frac{Y}{m} + O\left(\frac{1}{m^2}\right). \tag{16.10}$$

Note that $e^{X/m}e^{Y/m}$ tends to $I$ as $m$ tends to infinity, so that
the left-hand side of (16.10) is defined for all sufficiently large $m$.

(c) Prove the Lie Product Formula.

8. (a) Show that for all $X, Y \in M_n(\mathbb{C})$,


+ <y) 


m 


<  m 

X 

m—  1 

Y 

1=0 

(b) Show that the map $X \mapsto e^X$ is a continuously differentiable
map of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$ to itself.

(c) Using Exercise 4, show that the differential of the map $X \mapsto e^X$
at $X = 0$ is the identity map of $M_n(\mathbb{C})$ to itself. (Recall that the
differential of a smooth map of $\mathbb{R}^j$ to $\mathbb{R}^k$, evaluated at a point in
$\mathbb{R}^j$, is a linear map of $\mathbb{R}^j$ to $\mathbb{R}^k$.)

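9. Suppose $\mathfrak{g}$ is a Lie algebra and $\mathfrak{h}$ is an ideal in $\mathfrak{g}$. Let $\mathfrak{g}/\mathfrak{h}$ denote the
vector space quotient of $\mathfrak{g}$ by $\mathfrak{h}$. Show that the bracket on $\mathfrak{g}$ descends
unambiguously to a bilinear map on $\mathfrak{g}/\mathfrak{h}$, and that $\mathfrak{g}/\mathfrak{h}$ forms a Lie
algebra under this map.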


10. Suppose that $G_1$, $G_2$, and $G_3$ are matrix Lie groups with Lie algebras
$\mathfrak{g}_1$, $\mathfrak{g}_2$, and $\mathfrak{g}_3$, respectively. Suppose that $\Phi : G_1 \to G_2$ and
$\Psi : G_2 \to G_3$ are Lie group homomorphisms with associated Lie algebra
homomorphisms $\phi$ and $\psi$, respectively. Show that the Lie algebra
homomorphism associated to $\Psi \circ \Phi : G_1 \to G_3$ is $\psi \circ \phi$.


11. Show that isomorphic matrix Lie groups have isomorphic Lie algebras.


12. Suppose $G_1$ and $G_2$ are matrix Lie groups with Lie algebras $\mathfrak{g}_1$ and
$\mathfrak{g}_2$, respectively. Suppose $\Phi : G_1 \to G_2$ is a Lie group homomorphism
with the property that the associated Lie algebra homomorphism
$\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is injective. Show that there exists a neighborhood $U$ of
the identity in $G_1$ such that $U \cap \ker \Phi = \{I\}$.

Hint :  Use  Theorem  16.25. 


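13. (a) Show that every $R \in \mathrm{SO}(3)$ has an eigenvalue of 1.

(b) Show that every $R \in \mathrm{SO}(3)$ is conjugate in $\mathrm{SO}(3)$ to a matrix of
the form

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$$

for some $\theta \in \mathbb{R}$.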


(c)  Show  that  the  exponential  map  from  so(3)  to  SO (3)  is  surjective. 

(d)  Show  that  SO (3)  is  connected. 


14. Show that the center of $\mathrm{SO}(3)$ is trivial.
Hint: Use Part (a) of Exercise 13.

15. Given a Lie algebra $\mathfrak{g}$, let $[\mathfrak{g}, \mathfrak{g}]$ denote the space of linear combinations
of commutators, that is, the space spanned by elements of the form
$[X, Y]$ with $X, Y \in \mathfrak{g}$.

(a) Show that $[\mathfrak{g}, \mathfrak{g}]$ is an ideal in $\mathfrak{g}$ and that the quotient $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}]$
is commutative. (The ideal $[\mathfrak{g}, \mathfrak{g}]$ is called the commutator ideal
of $\mathfrak{g}$.)

(b) If $\mathfrak{g} = \mathfrak{so}(3)$, show that $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$.

(c) If $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is any finite-dimensional representation of $\mathfrak{g}$,
show that $\pi([\mathfrak{g}, \mathfrak{g}])$ is contained in $\mathfrak{sl}(V)$, the space of
endomorphisms of $V$ with trace zero.

16. (a) Show that the Lie algebra $\mathfrak{pu}(n) = \mathfrak{u}(n)/\{iaI\}$ is isomorphic to
the Lie algebra $\mathfrak{su}(n)$.

(b) Let $\{e^{2\pi ik/n}I\}$ denote the group of matrices that are of the form
of an $n$th root of unity times the identity. Show that the group
$\mathrm{PU}(n)$ is isomorphic to $\mathrm{SU}(n)/\{e^{2\pi ik/n}I\}$.

17. Suppose that $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$ and that
$A$ is an element of $G$. Show that the operation of left multiplication
by $A^{-1}$ is a diffeomorphism of $M_n(\mathbb{C})$. Now show that there exist
neighborhoods $U$ of 0 in $M_n(\mathbb{C})$ and $V$ of $A$ in $M_n(\mathbb{C})$ such that the
map $X \mapsto Ae^X$ maps $U$ diffeomorphically onto $V$ and such that for
$X \in U$, we have $X \in \mathfrak{g}$ if and only if $Ae^X \in G$. (Use Theorem 16.25.)

18. Suppose that $Z \in M_n(\mathbb{C})$ has the property that $ZX = XZ$ for all
$X \in M_n(\mathbb{C})$. Show that $Z = cI$ for some $c \in \mathbb{C}$.

19. Suppose $(\Pi, H)$ is a unitary representation of a matrix Lie group
$G$, and suppose $V_1$ and $V_2$ are finite-dimensional irreducible invariant
subspaces of $H$. Show that if $V_1$ and $V_2$ are not isomorphic as
representations of $G$, then $V_1$ is orthogonal to $V_2$ inside $H$.

Hint: Show that the orthogonal projection of $H$ onto $V_1$ or $V_2$ is an
intertwining map, and use Schur's lemma.


17 

Angular  Momentum  and  Spin 


17.1  The  Role  of  Angular  Momentum 
in  Quantum  Mechanics 

Classically, angular momentum may be thought of as the Hamiltonian
generator of rotations (Proposition 2.30). Angular momentum is a particularly
useful concept when a system has rotational symmetry, since in that
case the angular momentum is a conserved quantity (Proposition 2.18).
Quantum mechanically, angular momentum is still the "generator" of
rotations, meaning that it is the infinitesimal generator of a one-parameter
group of unitary rotation operators, in the sense of Stone's theorem
(Theorem 10.15). The quantum angular momentum is again conserved in
systems with rotational symmetry. This means that if the Hamiltonian $\hat{H}$ is
invariant under rotations, then $\hat{H}$ commutes with the angular momentum
operators, in which case, the angular momentum operators are constants
of motion in the quantum mechanical sense.

The various components of the classical angular momentum vector for
a particle in $\mathbb{R}^3$ satisfy certain simple commutation relations under the
Poisson bracket (Exercise 19 in Chap. 2). We will see that those relations are
the commutation relations for the Lie algebra $\mathfrak{so}(3)$ of the rotation group
$\mathrm{SO}(3)$. If $\hat{H}$ commutes with each component of the angular momentum,
each eigenspace for $\hat{H}$ (the solution space to $\hat{H}\psi = \lambda\psi$ for a given $\lambda$) is
invariant under the angular momentum operators. Thus, the eigenspace
constitutes a representation of the Lie algebra $\mathfrak{so}(3)$. By classifying the
irreducible (finite-dimensional) representations of $\mathfrak{so}(3)$, we can obtain a lot
of information about the structure of the solution spaces to the equation
$\hat{H}\psi = \lambda\psi$, in the case that $\hat{H}$ is invariant under rotations. Specifically, the
representation theory of $\mathfrak{so}(3)$ allows us to determine completely the angular
dependence of a solution $\psi(x)$, leaving only the radial dependence of $\psi$ to
be determined. This has the effect of reducing the number of independent
variables from three to one (just the radius $r$ in polar coordinates), thereby
reducing the problem to solving an ordinary differential equation.

Understanding angular momentum from the point of view of representations
of a Lie algebra also prepares us to understand the concept of spin.
The Hilbert space for a particle in $\mathbb{R}^3$ with spin is the tensor product
of $L^2(\mathbb{R}^3)$ with a finite-dimensional vector space $V$, where $V$ carries an
irreducible action of the rotation group $\mathrm{SO}(3)$. In this setting, the proper
notion of "action" is a projective representation of $\mathrm{SO}(3)$, meaning a family
of operators satisfying the relations of $\mathrm{SO}(3)$ up to phase factors (constants
of absolute value one). These phase factors are permitted because,
physically, two vectors that differ only by a constant represent the same physical
state. By Proposition 16.46, every projective representation of $\mathrm{SO}(3)$ can
be de-projectivized at the level of the Lie algebra $\mathfrak{so}(3)$. Conversely, every
irreducible ordinary representation of the Lie algebra $\mathfrak{so}(3)$ gives rise to a
representation of the universal cover $\mathrm{SU}(2)$ of $\mathrm{SO}(3)$, which in turn gives
rise (Theorem 16.47) to a projective representation of $\mathrm{SO}(3)$. Thus, the
possibilities for the space $V$ are in one-to-one correspondence with the
irreducible representations of the Lie algebra $\mathfrak{so}(3)$. In the case of "half-integer
spin," the space $V$ does not carry an ordinary representation of the group
$\mathrm{SO}(3)$.


17.2 The Angular Momentum Operators in $\mathbb{R}^3$


Recall from Sect. 2.4 that the classical angular momentum for a particle in
$\mathbb{R}^3$ is given by $J = x \times p$, so that, say, $J_3 = x_1p_2 - x_2p_1$. As in Sect. 3.10,
we introduce the quantum mechanical counterpart, a "vector" $\hat{J}$ with
components that are operators,

$$\hat{J} = X \times P.$$

Thus, for example, $\hat{J}_1 = X_2P_3 - X_3P_2$. Note that each component of the
angular momentum involves products of distinct components of the
position and momentum operators $X$ and $P$, which commute. Thus, in the
expression for, say, $\hat{J}_1$, it does not matter whether we write $X_2P_3$ or $P_3X_2$.

The angular momentum operators are unbounded operators and are
defined only on a dense subspace of $L^2(\mathbb{R}^3)$. For the moment, we will not
specify the domain of these operators, leaving that until the next section.
We will see, however, that the domain of each angular momentum operator
contains the Schwartz space $\mathcal{S}(\mathbb{R}^3)$ (Definition A.15).

As in Exercise 10 in Chap. 3, we can use the canonical commutation
relations to obtain $[\hat{J}_1, \hat{J}_2] = i\hbar\hat{J}_3$. We may similarly compute $[\hat{J}_2, \hat{J}_3]$ and
$[\hat{J}_3, \hat{J}_1]$ to obtain the complete set of commutation relations among the $\hat{J}$'s:

$$\frac{1}{i\hbar}[\hat{J}_1, \hat{J}_2] = \hat{J}_3; \quad \frac{1}{i\hbar}[\hat{J}_2, \hat{J}_3] = \hat{J}_1; \quad \frac{1}{i\hbar}[\hat{J}_3, \hat{J}_1] = \hat{J}_2. \tag{17.1}$$

These relations compare well with the Poisson bracket relations among the
various components of the classical angular momentum vector (Exercise 19
in Chap. 2).

Writing out $\hat{J}_3$ explicitly, we have

$$(\hat{J}_3\psi)(x) = -i\hbar\left(x_1\frac{\partial\psi}{\partial x_2} - x_2\frac{\partial\psi}{\partial x_1}\right) = -i\hbar\,\frac{d}{d\theta}\,\psi(R_\theta x)\Big|_{\theta=0}, \tag{17.2}$$

where $R_\theta$ denotes a counterclockwise rotation by angle $\theta$ in the $(x_1, x_2)$
plane, with similar expressions for $\hat{J}_1$ and $\hat{J}_2$. This description of the
angular momentum operators demonstrates that they, like the components of
the classical angular momentum, are closely connected to rotations (recall
Propositions 2.18 and 2.30). The connection between angular momentum
and rotations will be made more explicit in the following sections by
recognizing that they make up the Lie algebra action associated with the natural
action of the rotation group on $L^2(\mathbb{R}^3)$.
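
The algebraic content of the commutation relations can be sanity-checked in a finite-dimensional model. The sketch below is an illustration only, under stated assumptions: it takes the standard rotation generators `F1`, `F2`, `F3` as a basis of $\mathfrak{so}(3)$ (assumed here to match the basis the text refers to as (16.2), which is not reproduced in this section), sets $\hbar = 1$, and checks that the $3 \times 3$ matrices $i\hbar F_j$ satisfy relations of the same form as those for the angular momentum operators. These matrices are stand-ins, not the operators on $L^2(\mathbb{R}^3)$ themselves.

```python
HBAR = 1.0  # assumption: units in which Planck's constant equals 1

# Assumed basis of so(3): generators of rotations about the three axes.
F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


# so(3) relations: [F1, F2] = F3 and cyclic permutations.
so3_ok = (close(commutator(F1, F2), F3)
          and close(commutator(F2, F3), F1)
          and close(commutator(F3, F1), F2))

# Matrices J_j = i*hbar*F_j then satisfy [J1, J2] = i*hbar*J3 and cyclic.
J = [scale(1j * HBAR, F) for F in (F1, F2, F3)]
angular_ok = all(close(commutator(J[p], J[q]), scale(1j * HBAR, J[r]))
                 for p, q, r in [(0, 1, 2), (1, 2, 0), (2, 0, 1)])
print(so3_ok, angular_ok)
```

The factor $i\hbar$ converts the real bracket relations of $\mathfrak{so}(3)$ into relations with $i\hbar$ on the right-hand side, since $(i\hbar)^2 F_3 = i\hbar\,(i\hbar F_3)$.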

We may define a new version of the angular momentum operators $J_j$,
given by

$$J_j = \frac{1}{\hbar}\hat{J}_j. \tag{17.3}$$

Since Planck's constant and angular momentum have the same units, the
$J_j$'s do not depend on the choice of units; we refer to them as the
dimensionless versions of the angular momentum operators.


17.3  Angular  Momentum  from  the  Lie  Algebra 
Point  of  View 

We begin this section by looking at the natural action of the rotation group
SO(3) on $L^2(\mathbb{R}^3)$.

Definition 17.1 For each $R \in \mathrm{SO}(3)$, define
$\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ by

\[ (\Pi(R)\psi)(\mathbf{x}) = \psi(R^{-1}\mathbf{x}). \tag{17.4} \]

Proposition 17.2 For each $R \in \mathrm{SO}(3)$, the map
$\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ is unitary. Furthermore,
the map $\Pi : \mathrm{SO}(3) \to \mathrm{U}(L^2(\mathbb{R}^3))$ is a strongly
continuous homomorphism.

Proof. Since the Lebesgue measure on $\mathbb{R}^3$ is invariant under
rotations, $\Pi(R)$ is unitary for all $R \in \mathrm{SO}(3)$. It is easily
checked that $\Pi(R_1 R_2) = \Pi(R_1)\Pi(R_2)$; for this to be true, we need
to have $\psi(R^{-1}\mathbf{x})$ rather than $\psi(R\mathbf{x})$ in the
definition of $\Pi(R)$. Arguing as in the proof of Example 10.12, we can
easily verify that $\Pi$ is strongly continuous. ■
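
The "easily checked" homomorphism property can be spelled out in one line. The following is a sketch of that check, together with what goes wrong without the inverse; the symbol $\tilde{\Pi}$ is introduced here only for illustration.

```latex
(\Pi(R_1)\Pi(R_2)\psi)(\mathbf{x})
  = (\Pi(R_2)\psi)(R_1^{-1}\mathbf{x})
  = \psi(R_2^{-1}R_1^{-1}\mathbf{x})
  = \psi\big((R_1R_2)^{-1}\mathbf{x}\big)
  = (\Pi(R_1R_2)\psi)(\mathbf{x}).
\]
\[
\text{If instead } (\tilde{\Pi}(R)\psi)(\mathbf{x}) := \psi(R\mathbf{x}),
\text{ then }
(\tilde{\Pi}(R_1)\tilde{\Pi}(R_2)\psi)(\mathbf{x}) = \psi(R_2R_1\mathbf{x})
  = (\tilde{\Pi}(R_2R_1)\psi)(\mathbf{x}),
```

so the order of the factors would be reversed.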

Recall the computation of the Lie algebra so(3) of SO(3) in Sect. 16.5,
and the basis $\{F_1, F_2, F_3\}$ for so(3) in (16.2) in that section.

Proposition 17.3 For each $X \in \mathfrak{so}(3)$, let $\pi(X)$ denote the
skew-self-adjoint operator such that

\[ \Pi(e^{tX}) = e^{t\pi(X)}. \tag{17.5} \]

Then the domain of each $\pi(F_j)$ contains the Schwartz space
$\mathcal{S}(\mathbb{R}^3)$, and on $\mathcal{S}(\mathbb{R}^3)$ we have the
relation

\[ \hat{J}_j = i\hbar\,\pi(F_j). \]

In the notation of Stone's theorem (Theorem 10.15), the operator $\pi(X)$
in (17.5) is $i$ times the infinitesimal generator of the one-parameter unitary
group $t \mapsto \Pi(e^{tX})$.

Proof. In the case of $\hat{J}_3$, we compute as in Example 16.16 that
$e^{tF_3}$ is a counterclockwise rotation in the $(x_1, x_2)$-plane. If
$\psi$ belongs to $\mathcal{S}(\mathbb{R}^3)$, then the limit defining the
derivative in (17.2) is easily seen to hold in the $L^2$ sense. Thus,
recalling the inverse on the right-hand side of (17.4), we see that
$\hat{J}_3$ coincides with $i\hbar\,\pi(F_3)$, as claimed. Similar
calculations apply to $\hat{J}_1$ and $\hat{J}_2$. ■

Although it is not easy to determine the precise domain of each angular
momentum operator, we can see from Proposition 16.54 that if $\psi$ belongs
to a finite-dimensional subspace of $L^2(\mathbb{R}^3)$ that is invariant
under rotations, then $\psi$ belongs to the domain of each $\hat{J}_j$.


17.4  The  Irreducible  Representations  of  so(3) 

In this section, we classify the irreducible finite-dimensional representations
of the Lie algebra so(3), up to isomorphism. (See Sect. 16.7 for the
definitions and elementary properties of representations.) All representations
are taken over the field of complex numbers and assumed to have dimension
at least one. We continue to use the basis $\{F_1, F_2, F_3\}$ for so(3)
in (16.2).

Theorem 17.4 Let $\pi : \mathfrak{so}(3) \to \mathfrak{gl}(V)$ be a
finite-dimensional irreducible representation of so(3). Define operators
$L^+$, $L^-$, and $L_3$ on $V$ by

\begin{align*}
L^+ &= i\pi(F_1) - \pi(F_2) \\
L^- &= i\pi(F_1) + \pi(F_2) \\
L_3 &= i\pi(F_3).
\end{align*}

Let $l = \tfrac{1}{2}(\dim V - 1)$, so that $\dim V = 2l + 1$. Then there
exists a basis $v_0, v_1, \ldots, v_{2l}$ of $V$ such that

\[
\begin{aligned}
L_3 v_j &= (l - j)\, v_j \\
L^- v_j &= \begin{cases} v_{j+1} & \text{if } j < 2l \\ 0 & \text{if } j = 2l \end{cases} \\
L^+ v_j &= \begin{cases} j(2l + 1 - j)\, v_{j-1} & \text{if } j > 0 \\ 0 & \text{if } j = 0. \end{cases}
\end{aligned}
\tag{17.6}
\]

Thus, the quantity $l$ completely determines the structure of an irreducible
representation of so(3). Since $\dim V$ is a positive integer, $l$ has to
have one of the following values:

\[ l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots. \tag{17.7} \]


The  proof  of  Theorem  17.4  is  given  later  in  this  section. 


Definition 17.5 If $(\pi, V)$ is an irreducible finite-dimensional
representation of so(3), then the spin of $(\pi, V)$ is the largest
eigenvalue $l$ of the operator $L_3 := i\pi(F_3)$. Equivalently, $l$ is the
unique number such that $\dim V = 2l + 1$.

Our next result says that all the values of $l$ in (17.7) actually arise as
spins of irreducible representations of so(3).

Theorem 17.6 For any $l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$, there
exists an irreducible representation of so(3) of dimension $2l + 1$, and any
two irreducible representations of so(3) of dimension $2l + 1$ are isomorphic.

Note that the theorem is only asserting the existence, for each $l$, of a
representation of the Lie algebra so(3). As we will see in the next section,
an irreducible representation $\pi$ of so(3) comes from a representation
$\Pi$ of SO(3) if and only if $l$ is an integer. Nevertheless, the
representations of so(3) with half-integer values of $l$ (the ones where $l$
is half of an integer but not an integer) still play an important role in
quantum physics, as discussed in Sect. 17.8. (Although it would be clearer to
refer to the case $l = 1/2, 3/2, 5/2, \ldots$ as "integer plus a half," the
terminology "half-integer" is firmly established.)

By comparison to Proposition 17.3, we may think of $L_3$ as the analog
of the third component of the dimensionless angular momentum operator
on the space $V$. Indeed, we will eventually be interested in applying
Theorem 17.4 to the case in which $V$ is a subspace of $L^2(\mathbb{R}^3)$
that is invariant under the action of SO(3). In that case, $L_3$ will be
precisely (the restriction to $V$ of) the dimensionless angular momentum
operator $J_3$.

Observe that Theorem 17.4 bears a strong similarity to our analysis of
the quantum harmonic oscillator. In both cases, we have a "chain" of
eigenvectors for a certain operator, along with raising and lowering operators
that raise and lower the eigenvalue of that operator. In the case of the
harmonic oscillator, we have a chain that begins with a ground state and
then extends infinitely in one direction. In the case of so(3) representations,
we have a chain that is finite in both directions. The chain begins with an
eigenvector $v_0$ for $L_3$ with maximal eigenvalue, so that $v_0$ is
annihilated by the raising operator $L^+$. A key step in the proof of
Theorem 17.4 is to determine how the chain can terminate (in the direction of
lower eigenvalues for $L_3$) without violating the commutation relations among
$L_3$, $L^+$, and $L^-$.

Proof of Theorem 17.4. Since $\pi$ is a Lie algebra homomorphism, the
$\pi(F_j)$'s satisfy the same commutation relations as the $F_j$'s themselves.
From this we can easily verify the following relations among the operators
$L^+$, $L^-$, and $L_3$:

\begin{align}
[L_3, L^+] &= L^+ \tag{17.8} \\
[L_3, L^-] &= -L^- \tag{17.9} \\
[L^+, L^-] &= 2L_3. \tag{17.10}
\end{align}

Now, since we are working over the algebraically closed field $\mathbb{C}$,
the operator $L_3$ has at least one eigenvector $v$ with eigenvalue $\lambda$.
Consider, then, $L^+ v$. Using (17.8), we compute that

\[
L_3 L^+ v = (L^+ L_3 + L^+) v = L^+(\lambda v) + L^+ v = (\lambda + 1) L^+ v.
\tag{17.11}
\]

Thus, either $L^+ v = 0$ or $L^+ v$ is an eigenvector for $L_3$ with
eigenvalue $\lambda + 1$. We call $L^+$ the "raising operator," since it has
the effect of raising the eigenvalue of $L_3$ by 1.

If we apply $L^+$ repeatedly to $v$, we obtain eigenvectors for $L_3$ with
eigenvalues increasing by 1 at each step, as long as we do not get the zero
vector. Eventually, though, we must get 0, since the operator $L_3$ has only
finitely many eigenvalues. Thus, there exists $k \geq 0$ such that
$(L^+)^k v \neq 0$ but $(L^+)^{k+1} v = 0$. By applying (17.11) repeatedly,
we see that $(L^+)^k v$ is an eigenvector for $L_3$ with eigenvalue
$\lambda + k$.

Let us now introduce the notation $v_0 := (L^+)^k v$ and $\mu = \lambda + k$.
Then $v_0$ is a nonzero vector with $L^+ v_0 = 0$ and $L_3 v_0 = \mu v_0$. We
now forget about the original vector $v$ and eigenvalue $\lambda$ and consider
only $v_0$ and $\mu$. Define vectors $v_j$ by

\[ v_j = (L^-)^j v_0, \quad j = 0, 1, 2, \ldots. \]

Arguing as in (17.11), but using (17.9) in place of (17.8), we see that $L^-$
has the effect of either lowering the eigenvalue of $L_3$ by 1 or of giving the
zero vector. Thus, $L_3 v_j = (\mu - j) v_j$.

Next, we claim that for $j \geq 1$ we have

\[ L^+ v_j = j(2\mu + 1 - j)\, v_{j-1}, \quad j = 1, 2, 3, \ldots, \tag{17.12} \]

which is easily proved by induction on $j$, using (17.10) (Exercise 2). Since,
again, $L_3$ has only finitely many eigenvalues, $v_j$ must eventually be zero.
Thus, there exists some $N \geq 0$ such that $v_N \neq 0$ but $v_{N+1} = 0$.
Since $v_{N+1} = 0$, applying (17.12) with $j = N + 1$ gives

\[ 0 = L^+ v_{N+1} = (N + 1)(2\mu - N)\, v_N. \]

Since $v_N \neq 0$ and $N + 1 > 0$, we must have $2\mu - N = 0$. This means
that $\mu$ must equal $N/2$.
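
For readers who want to see the shape of the induction behind (17.12), here is one way the argument can go; it is a sketch using only (17.10) and the eigenvalue relation $L_3 v_j = (\mu - j) v_j$, not necessarily the intended solution to Exercise 2.

```latex
\text{Base case: } L^+ v_1 = L^+L^- v_0 = (L^-L^+ + 2L_3)v_0 = 2\mu\, v_0 .
\]
\[
\text{Inductive step: }
L^+ v_{j+1} = L^+L^- v_j = L^-L^+ v_j + 2L_3 v_j
  = j(2\mu+1-j)\,L^- v_{j-1} + 2(\mu-j)\,v_j
\]
\[
  = \big[\,j(2\mu+1-j) + 2(\mu-j)\,\big]\, v_j
  = (j+1)(2\mu - j)\, v_j
  = (j+1)\big(2\mu + 1 - (j+1)\big)\, v_j .
```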

Letting $l = N/2$ and putting $\mu = N/2 = l$, we have the formulas recorded
in (17.6). Meanwhile, since the $v_j$'s are eigenvectors for $L_3$ with
distinct eigenvalues, the $v_j$'s are automatically linearly independent.
Furthermore, the span of the $v_j$'s is invariant under $L^+$, $L^-$, and
$L_3$, hence under all of so(3). Since $V$ is assumed to be irreducible, the
span of the $v_j$'s must be all of $V$. Thus, the $v_j$'s form a basis for
$V$. The dimension of $V$ is therefore equal to the number of $v_j$'s, which
is $N + 1 = 2l + 1$. ■

Proof of Theorem 17.6. We construct $V$ simply by defining a space
$V$ with basis $v_0, v_1, \ldots, v_{2l}$ and defining the action of so(3)
by (17.6). It is a simple matter (Exercise 4) to check that $L^+$, $L^-$, and
$L_3$, defined in this way, have the correct commutation relations, so that
$V$ is indeed a representation of so(3).
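
The commutation check mentioned above can also be done by machine for small dimensions. The following is a minimal sketch, using exact rational arithmetic, that builds the matrices of the three operators in the basis of (17.6) and tests the relations (17.8) to (17.10). The function names (`irrep_matrices`, `satisfies_relations`, and so on) are hypothetical helpers introduced for this illustration, and the parameter `two_l` stands for $2l$ so that half-integer spins can be passed as integers.

```python
from fractions import Fraction


def irrep_matrices(two_l):
    """Matrices of L3, L+, L- in the basis v_0..v_{2l} of (17.6).

    Column j of each matrix holds the coordinates of the image of v_j.
    """
    n = two_l + 1
    l = Fraction(two_l, 2)
    L3 = [[Fraction(0)] * n for _ in range(n)]
    Lp = [[Fraction(0)] * n for _ in range(n)]
    Lm = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        L3[j][j] = l - j                                   # L3 v_j = (l - j) v_j
        if j < two_l:
            Lm[j + 1][j] = Fraction(1)                     # L- v_j = v_{j+1}
        if j > 0:
            Lp[j - 1][j] = Fraction(j * (two_l + 1 - j))   # L+ v_j = j(2l+1-j) v_{j-1}
    return L3, Lp, Lm


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def scaled(c, a):
    return [[c * x for x in row] for row in a]


def satisfies_relations(two_l):
    """True if (17.8), (17.9) and (17.10) hold for the spin two_l / 2 matrices."""
    L3, Lp, Lm = irrep_matrices(two_l)
    return (commutator(L3, Lp) == Lp
            and commutator(L3, Lm) == scaled(-1, Lm)
            and commutator(Lp, Lm) == scaled(2, L3))


# Check spins l = 0, 1/2, 1, ..., 4.
print(all(satisfies_relations(two_l) for two_l in range(0, 9)))
```

A finite check of this kind does not replace the general verification in Exercise 4, but it is a quick way to catch sign or index errors in (17.6).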

It remains to show that $V$ is irreducible. Suppose that $W$ is an invariant
subspace of $V$ and that $W \neq \{0\}$. We need to show that $W = V$. To
this end, suppose that $w$ is some nonzero element of $W$, which we can
decompose as $w = \sum_{j=0}^{2l} a_j v_j$. Let $j_0$ be the largest index
for which $a_j$ is nonzero. According to the formula for $L^+$ in (17.6),
applying $L^+$ to any of the vectors $v_1, \ldots, v_{2l}$ gives a nonzero
multiple of the previous element in our chain. Thus, $(L^+)^{j_0} w$ will be a
nonzero multiple of $v_0$. Since $W$ is invariant, this means that $v_0$
belongs to $W$. But then by applying $L^-$ repeatedly, we see that $v_j$
belongs to $W$ for each $j$, so that $W = V$.

Theorem 17.4 tells us that any irreducible representation of so(3) of
dimension $2l + 1$ has a basis as in (17.6). We can then construct an
isomorphism between any two irreducible representations by mapping this basis
in one space to the corresponding basis in the other space. ■

In the rest of this section, we look at some additional properties of
representations of so(3).

Proposition 17.7 Let $\pi : \mathfrak{so}(3) \to \mathfrak{gl}(V)$ be an
irreducible representation of so(3). Then there exists an inner product on
$V$, unique up to multiplication by a constant, such that $\pi(X)$ is
skew-self-adjoint for all $X \in \mathfrak{so}(3)$.

Proof. Recalling how the operators $L_3$, $L^+$, and $L^-$ are defined, we can
see that the assertion that each $\pi(X)$, $X \in \mathfrak{so}(3)$, is
skew-self-adjoint is equivalent to the assertion that $L_3$ is self-adjoint
and that $L^+$ and $L^-$ are adjoints of each other. Since the $v_j$'s are
eigenvectors for $L_3$ with distinct eigenvalues, if $L_3$ is to be
self-adjoint, the $v_j$'s must be orthogonal. Conversely, if we have any inner
product for which the $v_j$'s are orthogonal, then $L_3$ will be self-adjoint,
as is easily verified.

It remains to investigate the consequences of the condition $(L^+)^* = L^-$.
Assuming this condition, we compute that

\[
\langle v_j, v_j \rangle = \langle L^- v_{j-1}, L^- v_{j-1} \rangle
  = \langle v_{j-1}, L^+ L^- v_{j-1} \rangle.
\]

But $L^+ L^- = L^- L^+ + 2L_3$. Furthermore,
$L_3 v_{j-1} = (l - j + 1) v_{j-1}$ and
$L^+ v_{j-1} = (j - 1)(2l - j + 2) v_{j-2}$ and, thus,

\begin{align*}
\langle v_j, v_j \rangle
  &= \langle v_{j-1}, (L^- L^+ + 2L_3) v_{j-1} \rangle \\
  &= (j - 1)(2l - j + 2) \langle v_{j-1}, L^- v_{j-2} \rangle
     + 2(l - j + 1) \langle v_{j-1}, v_{j-1} \rangle.
\end{align*}

Recalling that $L^- v_{j-2} = v_{j-1}$ and simplifying gives

\[
\langle v_j, v_j \rangle = j(2l - j + 1) \langle v_{j-1}, v_{j-1} \rangle.
\tag{17.13}
\]

It is easy to see that if the $v_j$'s are orthogonal, then $L^+$ and $L^-$ are
adjoints of each other if and only if the normalization condition (17.13)
holds for $j = 1, 2, \ldots, 2l$. Since $j(2l - j + 1)$ is positive for each
such $j$, there is no obstruction to normalizing the $v_j$'s so that this
condition holds, and so an inner product with the desired property exists.
Since the only freedom of choice in defining the inner product is the
normalization of $v_0$, the inner product is unique up to multiplication by a
constant. ■
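
To make the normalization condition concrete, the sketch below computes the squared norms forced by the recursion (17.13), starting from the arbitrary choice that $v_0$ has squared norm 1, and then checks entry by entry that the raising operator is the adjoint of the lowering operator for the resulting diagonal inner product. The helper names are hypothetical, and `two_l` again stands for $2l$.

```python
def squared_norms(two_l):
    """Squared norms <v_j, v_j> for j = 0..2l from (17.13), with <v_0, v_0> = 1."""
    norms = [1]
    for j in range(1, two_l + 1):
        # <v_j, v_j> = j (2l - j + 1) <v_{j-1}, v_{j-1}>
        norms.append(j * (two_l - j + 1) * norms[-1])
    return norms


def raising_is_adjoint_of_lowering(two_l):
    """Check <L+ v_j, v_k> == <v_j, L- v_k> for all basis vectors.

    The v_j are taken to be orthogonal with the squared norms above, and
    L+, L- act as in (17.6).
    """
    norms = squared_norms(two_l)
    n = two_l + 1
    for j in range(n):
        for k in range(n):
            # L+ v_j = j(2l+1-j) v_{j-1}, so only k == j-1 contributes.
            lhs = j * (two_l + 1 - j) * norms[k] if k == j - 1 else 0
            # L- v_k = v_{k+1} (zero if k == 2l), so only j == k+1 contributes.
            rhs = norms[j] if (k < two_l and j == k + 1) else 0
            if lhs != rhs:
                return False
    return True


print(squared_norms(2))                       # squared norms for spin l = 1
print(raising_is_adjoint_of_lowering(3))      # spin l = 3/2
```

Rescaling the starting value of the norm of $v_0$ rescales every entry of the list by the same factor, which is the uniqueness statement in the proposition.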

Proposition 17.8 Suppose $(\pi, V)$ is an irreducible representation of
so(3) of dimension $2l + 1$. Define the Casimir operator
$C_\pi \in \mathrm{End}(V)$ by the formula

\[ C_\pi = \pi(F_1)^2 + \pi(F_2)^2 + \pi(F_3)^2. \]

Then for all $v \in V$, we have

\[ C_\pi v = -l(l + 1)\, v. \]

Proof. See Exercise 3. ■
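
One possible route to this result, offered as a sketch rather than as the intended solution of the exercise, is to rewrite the Casimir operator in terms of the raising and lowering operators of Theorem 17.4, using $\pi(F_1) = (L^+ + L^-)/(2i)$, $\pi(F_2) = (L^- - L^+)/2$, and $\pi(F_3) = -iL_3$:

```latex
C_\pi = -\Big( L_3^2 + \tfrac{1}{2}\,(L^+L^- + L^-L^+) \Big)
      = -\big( L_3^2 + L_3 + L^-L^+ \big),
\]
\[
C_\pi v_0 = -\big(l^2 + l\big)\,v_0 = -l(l+1)\,v_0
\quad\text{since } L^+v_0 = 0 \text{ and } L_3 v_0 = l\,v_0 .
```

The second form uses $L^+L^- = L^-L^+ + 2L_3$ from (17.10). To pass from $v_0$ to the other basis vectors $v_j = (L^-)^j v_0$, one would still need to verify from the commutation relations that $C_\pi$ commutes with $L^-$.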

If we look at the proof of Theorem 17.4, we see that the only place in
which irreducibility was used is in showing that the span of
$v_0, v_1, \ldots, v_{2l}$ is equal to $V$. We can therefore obtain the
following result, which will be used in Sect. 17.9.

Proposition 17.9 Let $(\pi, V)$ be any finite-dimensional representation of
so(3), not necessarily irreducible. Suppose $v_0$ is a nonzero element of $V$
such that $L^+ v_0 = 0$ and $L_3 v_0 = \lambda v_0$ for some
$\lambda \in \mathbb{C}$. Then $\lambda$ is equal to a non-negative integer or
half-integer $l$. Furthermore, the vectors $v_0, v_1, \ldots, v_{2l}$
defined by

\[ v_j = (L^-)^j v_0, \quad j = 0, 1, \ldots, 2l, \]

span an irreducible invariant subspace of $V$ of dimension $2l + 1$, and
$L^+$, $L^-$, and $L_3$ act on these vectors according to the formulas in
Theorem 17.4.

In general, given a finite-dimensional representation $(\pi, V)$ of a Lie
algebra and a nonzero vector $v_0 \in V$, we say that $v_0$ is a cyclic
vector for $V$ if the smallest invariant subspace of $V$ containing $v_0$ is
all of $V$. In Proposition 17.9, the vector $v_0$ is certainly a cyclic vector
for $W := \mathrm{span}(v_0, \ldots, v_{2l})$. It should be noted, however,
that a representation's having a cyclic vector does not, in general, mean that
the representation is irreducible (Exercise 5). Thus, the irreducibility of
$W$ is not the result of some general result about cyclic vectors, but holds
only because of the assumed special properties of the vector $v_0$.


17.5 The Irreducible Representations of SO(3)

Having classified the irreducible representations of the Lie algebra so(3),
we now turn to the classification of the representations of the group SO(3).
Since SO(3) is connected (Exercise 13 in Chap. 16), Proposition 16.39 tells
us that a representation of SO(3) is irreducible if and only if the associated
Lie algebra representation is irreducible, and that two representations of
SO(3) are isomorphic if and only if the associated Lie algebra
representations are isomorphic. Thus, to classify the irreducible
representations of SO(3) up to isomorphism, we merely have to determine which
irreducible representations of the Lie algebra so(3) come from a
representation of the group SO(3).

Proposition 17.10 Let $\pi_l : \mathfrak{so}(3) \to \mathfrak{gl}(V)$ be an
irreducible representation of so(3), with spin
$l := \tfrac{1}{2}(\dim V - 1)$. If $l$ is an integer (i.e., if the dimension
of $V$ is odd), then there exists a representation
$\Pi_l : \mathrm{SO}(3) \to \mathrm{GL}(V)$ such that $\Pi_l$ and $\pi_l$ are
related as in Theorem 16.23. If $l$ is a half-integer (i.e., if the dimension
of $V$ is even), then no such representation exists.

It follows from this result and Proposition 16.39 that the irreducible
representations of the group SO(3) are precisely the $\Pi_l$'s for which $l$
is an integer.

Proof. If $l$ is a half-integer, then $L_3$ is diagonal in the basis
$\{v_j\}$, with eigenvalues being half-integers. Thus,

\[ e^{2\pi \pi_l(F_3)} = e^{-2\pi i L_3} = -I. \]

(Here the "$\pi$" in front of $\pi_l$ is the number $\pi = 3.14\ldots$.) On
the other hand, by a simple modification of Example 16.16, we can see that the
matrix $F_3 \in \mathfrak{so}(3)$ satisfies $e^{2\pi F_3} = I$. Thus, if a
corresponding representation $\Pi_l$ of SO(3) existed, we would have

\[ \Pi_l(I) = \Pi_l(e^{2\pi F_3}) = e^{2\pi \pi_l(F_3)} = -I, \]

which is a contradiction.

If $l$ is an integer, we make use of the isomorphism $\phi$ between su(2)
and so(3) described in the proof of Example 16.32, which maps the basis
$\{E_1, E_2, E_3\}$ of su(2) to the basis $\{F_1, F_2, F_3\}$ of so(3). We
obtain a representation $\pi_l'$ of su(2) by setting
$\pi_l'(X) = \pi_l(\phi(X))$. Since SU(2) is simply connected, Theorem 16.30
tells us that there is a representation $\Pi_l'$ of SU(2) related to $\pi_l'$
in the usual way. We then compute that

\[
\Pi_l'(-I) = \Pi_l'(e^{2\pi E_3}) = e^{2\pi \pi_l'(E_3)}
  = e^{2\pi \pi_l(F_3)} = e^{-2\pi i L_3} = I,
\]

since the eigenvalues of $L_3$ are integers.

Now, by Example 16.34, there is a surjective homomorphism $\Phi$ from
SU(2) onto SO(3) for which the associated Lie algebra homomorphism is $\phi$,
and $\ker \Phi = \{I, -I\}$. Since the kernel of $\Pi_l'$ contains
$\{I, -I\}$, the map $\Pi_l'$ factors through SO(3), giving a representation
$\Sigma_l$ of SO(3) such that $\Pi_l' = \Sigma_l \circ \Phi$. By Exercise 10
in Chap. 16, the associated Lie algebra representation $\sigma_l$ of so(3)
satisfies $\pi_l' = \sigma_l \circ \phi$, so that
$\sigma_l = \pi_l' \circ \phi^{-1} = \pi_l$. Thus, $\Sigma_l$ is the desired
representation of SO(3). ■
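
Both halves of the argument for Proposition 17.10 turn on the value of the exponential of $2\pi i$ times the diagonal operator $L_3$ (the sign in the exponent does not affect the result). The sketch below evaluates that exponential numerically on the eigenvalues $l - j$; the function names are hypothetical, `two_l` stands for $2l$, and the comparison uses a floating-point tolerance chosen for illustration.

```python
import cmath
from fractions import Fraction


def exp_2pi_i_L3(two_l):
    """Diagonal entries of exp(2*pi*i*L3), where L3 has eigenvalues l - j."""
    l = Fraction(two_l, 2)
    return [cmath.exp(2j * cmath.pi * float(l - j)) for j in range(two_l + 1)]


def is_scalar(entries, value, tol=1e-9):
    """True if every diagonal entry equals `value` up to the tolerance."""
    return all(abs(z - value) < tol for z in entries)


for two_l in range(0, 6):
    entries = exp_2pi_i_L3(two_l)
    kind = "integer" if two_l % 2 == 0 else "half-integer"
    sign = "+I" if is_scalar(entries, 1) else "-I" if is_scalar(entries, -1) else "neither"
    print(f"l = {Fraction(two_l, 2)} ({kind}): exp(2*pi*i*L3) = {sign}")
```

For integer spins every entry is 1, and for half-integer spins every entry is -1, which is the obstruction used in the first half of the proof.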


17.6 Realizing the Representations Inside $L^2(S^2)$

In this section, we deviate from the traditional treatment in the physics
literature by thinking of the "spherical harmonics" as restrictions to the
unit sphere of certain polynomials on $\mathbb{R}^3$, rather than describing
the spherical harmonics in angular coordinates on the sphere. Our approach
avoids some messy computations in polar coordinates and it also generalizes
readily to higher dimensions.

Recall from Sect. 17.3 that there is a natural unitary representation
$\Pi : \mathrm{SO}(3) \to \mathrm{U}(L^2(\mathbb{R}^3))$ given by
$(\Pi(R)\psi)(\mathbf{x}) = \psi(R^{-1}\mathbf{x})$. In solving rotationally
invariant problems such as the quantum hydrogen atom, it will be useful
to understand the structure of finite-dimensional subspaces $V$ of
$L^2(\mathbb{R}^3)$ such that $V$ is invariant under $\Pi$ and such that the
restriction of $\Pi$ to $V$ is irreducible.

If we write functions on $\mathbb{R}^3$ in polar coordinates, then SO(3) acts
only on the angle variables. Thus, it is useful to consider also the action of
SO(3) on $L^2(S^2)$, given by the same formula as for $L^2(\mathbb{R}^3)$,
namely

\[ (\Pi(R)\psi)(\mathbf{x}) = \psi(R^{-1}\mathbf{x}), \quad \mathbf{x} \in S^2. \]

In computing the norm for $L^2(S^2)$, we use the surface area measure on
$S^2$, which is invariant under the action of SO(3). Once we have found
invariant subspaces inside $L^2(S^2)$, it is a simple matter to produce
invariant subspaces inside $L^2(\mathbb{R}^3)$ as well, as we will see in the
next section.

We will be interested in this section in harmonic polynomials on
$\mathbb{R}^3$, that is, polynomials $p$ satisfying $\Delta p = 0$, where
$\Delta$ is the Laplacian. Since we always consider representations over
$\mathbb{C}$, we allow these polynomials to have complex coefficients.

Definition 17.11 Let $l$ be a non-negative integer. Define a subspace $V_l$
of $L^2(S^2)$ by setting $V_l$ equal to the space of restrictions to $S^2$ of
harmonic polynomials on $\mathbb{R}^3$ that are homogeneous of degree $l$.
Then $V_l$ is called the space of spherical harmonics of degree $l$.

Note that if $p$ is a homogeneous polynomial on $\mathbb{R}^3$ of some degree
$l$, then the restriction of $p$ to $S^2$ is identically zero only if $p$
itself is identically zero. After all, if $p$ is homogeneous of degree $l$ and
zero on $S^2$, then

\[ p(\mathbf{x}) = |\mathbf{x}|^l\, p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) = 0 \]

for all $\mathbf{x} \neq 0$, and hence, by continuity, for all
$\mathbf{x} \in \mathbb{R}^3$. (By contrast, the nonzero, nonhomogeneous
polynomial $p(\mathbf{x}) := x_1^2 + x_2^2 + x_3^2 - 1$ is identically zero on
$S^2$.) We are therefore free to shift back and forth between thinking of the
elements of $V_l$ as functions on $S^2$ or as functions on $\mathbb{R}^3$.

It is well known that the Laplacian $\Delta$ commutes with rotations. It
follows that each $V_l$ is invariant under the action of the rotation group.
We will eventually see that $V_l$ is irreducible under this action.

Every homogeneous polynomial of degree 0 or 1 is harmonic. Thus, $V_0$
consists of the constant functions on $S^2$ and $V_1$ is spanned by the
restrictions to $S^2$ of the functions $x_1$, $x_2$, and $x_3$. Meanwhile,
the space of homogeneous polynomials of degree 2 is 6-dimensional, and the
space of harmonic polynomials that are homogeneous of degree 2 is spanned by
the following five polynomials: $x_1 x_2$, $x_2 x_3$, $x_3 x_1$,
$x_1^2 - x_2^2$, and $x_2^2 - x_3^2$. (The polynomial $x_1^2 - x_3^2$ is also
harmonic, but it is just the sum of $x_1^2 - x_2^2$ and $x_2^2 - x_3^2$.)
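
The degree-2 example can be checked mechanically. The sketch below represents a polynomial in three variables as a dictionary from exponent triples to coefficients (a representation chosen here purely for illustration) and applies the Laplacian term by term to the five quadratics listed above.

```python
def laplacian(poly):
    """Laplacian of a polynomial in x1, x2, x3.

    `poly` maps exponent triples (a, b, c) to the coefficient of
    x1**a * x2**b * x3**c. Terms with zero coefficient are dropped.
    """
    result = {}
    for exps, coeff in poly.items():
        for i in range(3):
            e = exps[i]
            if e >= 2:
                new = list(exps)
                new[i] = e - 2
                key = tuple(new)
                # d^2/dx_i^2 of x_i**e is e*(e-1) * x_i**(e-2)
                result[key] = result.get(key, 0) + coeff * e * (e - 1)
    return {key: c for key, c in result.items() if c != 0}


quadratics = {
    "x1*x2": {(1, 1, 0): 1},
    "x2*x3": {(0, 1, 1): 1},
    "x3*x1": {(1, 0, 1): 1},
    "x1^2 - x2^2": {(2, 0, 0): 1, (0, 2, 0): -1},
    "x2^2 - x3^2": {(0, 2, 0): 1, (0, 0, 2): -1},
}

for name, poly in quadratics.items():
    print(name, "harmonic:", laplacian(poly) == {})

# A homogeneous quadratic that is not harmonic, for contrast.
print("x1^2 harmonic:", laplacian({(2, 0, 0): 1}) == {})
```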

Theorem 17.12 The spaces $V_l$ have the following properties.

1. Each $V_l$ has dimension $2l + 1$.

2. Each $V_l$ is invariant under the action of the rotation group and
irreducible under this action.

3. For $l \neq m$, the spaces $V_l$ and $V_m$ are orthogonal in $L^2(S^2)$.

4. The Hilbert space $L^2(S^2)$ decomposes as the orthogonal direct sum of
the $V_l$'s, as $l$ ranges over the non-negative integers.


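The remainder of this section will be devoted to the proof of
Theorem 17.12. We proceed in a series of lemmas, along with some
corollaries of those lemmas.

Lemma 17.13 Let $\mathcal{P}$ denote the space of polynomials on
$\mathbb{R}^3$ with complex coefficients. There exists an inner product
$\langle \cdot, \cdot \rangle_{\mathcal{P}}$ on $\mathcal{P}$ with the
property that

\[ \langle p, \Delta q \rangle_{\mathcal{P}} = \langle x^2 p, q \rangle_{\mathcal{P}}, \]

where

\[ x^2 = x_1^2 + x_2^2 + x_3^2. \]

Proof. Although it is possible to give a combinatorial construction of the
desired inner product, we can also give an analytic construction. Every
polynomial $p$ on $\mathbb{R}^3$ certainly has a holomorphic extension to
$\mathbb{C}^3$, denoted $p_{\mathbb{C}}$. We may define, then,

\[
\langle p, q \rangle_{\mathcal{P}}
  = \int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)}\, q_{\mathbb{C}}(z)\,
    \frac{e^{-|z|^2}}{\pi^3}\, d^6 z,
\]

which is nothing but the inner product of $p_{\mathbb{C}}$ and
$q_{\mathbb{C}}$ as elements of the Segal-Bargmann space
$\mathcal{H}L^2(\mathbb{C}^3, \mu_1)$. According to Lemma 14.12, we have

\[
\int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)}\,
    \frac{\partial q_{\mathbb{C}}}{\partial z_j}(z)\,
    \frac{e^{-|z|^2}}{\pi^3}\, d^6 z
  = \int_{\mathbb{C}^3} \overline{z_j\, p_{\mathbb{C}}(z)}\, q_{\mathbb{C}}(z)\,
    \frac{e^{-|z|^2}}{\pi^3}\, d^6 z
\]

for all $p, q \in \mathcal{P}$ and all $j = 1, 2, 3$. This relation means that

\[
\left\langle p, \frac{\partial q}{\partial x_j} \right\rangle_{\mathcal{P}}
  = \langle x_j\, p, q \rangle_{\mathcal{P}},
\]

from which we readily obtain the desired property of our inner product. ■

A standard bit of elementary combinatorics shows that the number of
ordered triples $(l_1, l_2, l_3)$ of non-negative integers with
$l_1 + l_2 + l_3 = l$ is equal to $(l + 2)(l + 1)/2$. Since the monomials
$x_1^{l_1} x_2^{l_2} x_3^{l_3}$ with $l_1 + l_2 + l_3 = l$ form a basis for
the space $\mathcal{P}_l$ of homogeneous polynomials of degree $l$, we have
$\dim \mathcal{P}_l = (l + 2)(l + 1)/2$.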

Corollary 17.14 If $\mathcal{P}_l$ denotes the space of polynomials on
$\mathbb{R}^3$ that are homogeneous of degree $l$, then the Laplacian $\Delta$
maps $\mathcal{P}_l$ onto $\mathcal{P}_{l-2}$ for all $l \geq 2$. Thus, for
all $l \geq 2$, we have

\begin{align*}
\dim V_l &= \dim \mathcal{P}_l - \dim \mathcal{P}_{l-2} \\
         &= \frac{(l + 2)(l + 1)}{2} - \frac{l(l - 1)}{2} \\
         &= 2l + 1.
\end{align*}

Proof. Let us equip the finite-dimensional spaces $\mathcal{P}_l$ and
$\mathcal{P}_{l-2}$ with the inner product from Lemma 17.13. It is easy to see
that the statement, "The orthogonal complement of the image is the kernel of
the adjoint," applies to linear maps of one finite-dimensional inner product
space to another. Applying this to
$\Delta : \mathcal{P}_l \to \mathcal{P}_{l-2}$, we note that the adjoint of
$\Delta$ is multiplication by $x^2$, which is clearly injective, since
$x_1^2 + x_2^2 + x_3^2$ is zero only at the origin. Thus, the orthogonal
complement of the image of $\Delta$ is $\{0\}$. Since the spaces are
finite-dimensional, this means that $\Delta$ maps $\mathcal{P}_l$ onto
$\mathcal{P}_{l-2}$. ■
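
The dimension count can be confirmed for small degrees by writing the Laplacian as a matrix in the monomial bases and computing its rank exactly. The sketch below does this with rational arithmetic; the helper names are hypothetical, and the row reduction is a plain Gaussian elimination chosen for clarity rather than speed.

```python
from fractions import Fraction


def monomials(l):
    """Exponent triples (a, b, c) of non-negative integers with a + b + c == l."""
    return [(a, b, l - a - b) for a in range(l + 1) for b in range(l + 1 - a)]


def laplacian_matrix(l):
    """Matrix of the Laplacian from degree l to degree l - 2 (needs l >= 2)."""
    rows = monomials(l - 2)
    row_index = {m: i for i, m in enumerate(rows)}
    cols = monomials(l)
    mat = [[Fraction(0)] * len(cols) for _ in rows]
    for j, m in enumerate(cols):
        for i in range(3):
            if m[i] >= 2:
                target = list(m)
                target[i] -= 2
                mat[row_index[tuple(target)]][j] += m[i] * (m[i] - 1)
    return mat


def rank(mat):
    """Rank of a matrix of Fractions via Gaussian elimination."""
    mat = [row[:] for row in mat]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(r + 1, len(mat)):
            if mat[i][c] != 0:
                factor = mat[i][c] / mat[r][c]
                mat[i] = [x - factor * y for x, y in zip(mat[i], mat[r])]
        r += 1
    return r


def harmonic_dimension(l):
    """Dimension of the kernel of the Laplacian on degree-l homogeneous polynomials."""
    if l < 2:
        return len(monomials(l))  # every polynomial of degree 0 or 1 is harmonic
    return len(monomials(l)) - rank(laplacian_matrix(l))


for l in range(0, 6):
    print(l, len(monomials(l)), harmonic_dimension(l))
```

Each printed row lists the degree, the number of monomials, and the dimension of the harmonic subspace, which can be compared with $(l + 2)(l + 1)/2$ and $2l + 1$ respectively.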


Corollary 17.15 Let $l$ be a non-negative integer and let $k = l/2$ if $l$
is even and let $k = (l - 1)/2$ if $l$ is odd. Then each
$p \in \mathcal{P}_l$ can be decomposed in the form

\[
p(\mathbf{x}) = p_0(\mathbf{x}) + |\mathbf{x}|^2 p_1(\mathbf{x})
  + |\mathbf{x}|^4 p_2(\mathbf{x}) + \cdots + |\mathbf{x}|^{2k} p_k(\mathbf{x}),
\]

where each $p_j(\mathbf{x})$ is a harmonic polynomial that is homogeneous of
degree $l - 2j$. In particular, the restriction of $p$ to $S^2$ satisfies

\[ p\big|_{S^2} = (p_0 + p_1 + \cdots + p_k)\big|_{S^2}, \]

where $p_0 + p_1 + \cdots + p_k$ is a (nonhomogeneous) harmonic polynomial.

Given any polynomial $p$, not necessarily homogeneous, we can apply
Corollary 17.15 to each homogeneous piece of $p$. We see, then, that given
any polynomial $p$, there exists a harmonic polynomial $\tilde{p}$ such that
$p$ and $\tilde{p}$ have the same restriction to $S^2$.

Proof. We proceed by induction on $l$. If $l = 0$ or $l = 1$, then all
$p \in \mathcal{P}_l$ are harmonic and the desired decomposition is simply
$p = p_0$. Consider, then, some $l \geq 2$ and assume the result holds for all
degrees less than $l$. Lemma 17.13 tells us that $\mathcal{P}_l$ decomposes as
an orthogonal direct sum of the kernel of $\Delta$ and the image of
$\mathcal{P}_{l-2}$ under multiplication by $|\mathbf{x}|^2$. Thus, any
$p \in \mathcal{P}_l$ can be decomposed as
$p = p_0 + |\mathbf{x}|^2 q_0$, where $p_0$ is harmonic and $q_0$ belongs to
$\mathcal{P}_{l-2}$. By induction, $q_0$ has a decomposition of the desired
form; substituting this in for $q_0$ in the decomposition
$p = p_0 + |\mathbf{x}|^2 q_0$ gives the desired decomposition of $p$. ■
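
As a small worked instance of this decomposition (an illustration, not an example taken from the text), consider the degree-2 polynomial $p(\mathbf{x}) = x_1^2$, for which $k = 1$:

```latex
x_1^2 = \underbrace{\Big(x_1^2 - \tfrac{1}{3}|\mathbf{x}|^2\Big)}_{p_0}
        \;+\; |\mathbf{x}|^2 \underbrace{\tfrac{1}{3}}_{p_1},
\qquad
\Delta p_0 = 2 - \tfrac{1}{3}\cdot 6 = 0,
\]
\[
x_1^2\,\big|_{S^2} = \Big(x_1^2 - \tfrac{1}{3}|\mathbf{x}|^2 + \tfrac{1}{3}\Big)\Big|_{S^2}.
```

Here $p_0$ is harmonic and homogeneous of degree 2, $p_1$ is a constant (harmonic of degree 0), and the polynomial on the right of the last line is harmonic but not homogeneous.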

To show that $V_l$ is irreducible under the action $\Pi$ of SO(3), we pass to
the Lie algebra. Since, as we have remarked, restriction to the sphere is
injective on homogeneous polynomials, we may think of the elements of $V_l$
as polynomials on $\mathbb{R}^3$, in which case the Lie algebra action $\pi$
associated with $\Pi$ is given in terms of the usual angular momentum
operators.

Lemma 17.16 As in Theorem 17.4, let $L_3 = i\pi(F_3) = J_3$ and let
$L^+ = i\pi(F_1) - \pi(F_2) = J_1 + iJ_2$. For any non-negative integer $l$,
the polynomial $p(x_1, x_2, x_3) := (x_1 + ix_2)^l$ belongs to $V_l$ and
satisfies

\[ L_3 p = l p \]

and

\[ L^+ p = 0. \]

Proof. Since it is independent of $x_3$ and holomorphic as a function of
$z := x_1 + ix_2$, the polynomial $p$ is automatically harmonic, which can
also be verified by direct calculation. Meanwhile, applying $L_3$ to $p$ gives

\begin{align*}
L_3 p &= -i \left( x_1 \frac{\partial}{\partial x_2}
          - x_2 \frac{\partial}{\partial x_1} \right) (x_1 + ix_2)^l \\
      &= -i \left[ x_1 l (x_1 + ix_2)^{l-1} (i) - x_2 l (x_1 + ix_2)^{l-1} \right] \\
      &= l (x_1 + ix_2)^l.
\end{align*}

Finally, applying $L^+ := i\pi(F_1) - \pi(F_2)$ to $p$ gives

\begin{align*}
L^+ p &= -i \left( x_2 \frac{\partial}{\partial x_3}
          - x_3 \frac{\partial}{\partial x_2} \right) p
        + \left( x_3 \frac{\partial}{\partial x_1}
          - x_1 \frac{\partial}{\partial x_3} \right) p \\
      &= -i \left( -x_3 l (x_1 + ix_2)^{l-1} (i) \right)
        + x_3 l (x_1 + ix_2)^{l-1} (1) \\
      &= 0,
\end{align*}

as claimed. ■
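
The three claims of Lemma 17.16 (harmonicity, the eigenvalue under the third operator, and annihilation by the raising operator) can be checked for small degrees with a few lines of polynomial arithmetic. The sketch below stores polynomials as dictionaries from exponent triples to complex coefficients and assumes the dimensionless operators act as $J_3 = -i(x_1 \partial_2 - x_2 \partial_1)$ and cyclically, matching the computation in the proof; all function names are hypothetical.

```python
def add(p, q, factor=1):
    """Return p + factor * q, dropping terms whose coefficient is zero."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + factor * c
    return {key: c for key, c in out.items() if c != 0}


def mul(p, q):
    out = {}
    for (a1, b1, c1), u in p.items():
        for (a2, b2, c2), v in q.items():
            key = (a1 + a2, b1 + b2, c1 + c2)
            out[key] = out.get(key, 0) + u * v
    return {key: c for key, c in out.items() if c != 0}


def power(p, n):
    out = {(0, 0, 0): 1}
    for _ in range(n):
        out = mul(out, p)
    return out


def diff(p, i):
    """Partial derivative with respect to the i-th variable (i = 0, 1, 2)."""
    out = {}
    for exps, c in p.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = out.get(tuple(e), 0) + c * exps[i]
    return out


def times_x(p, i):
    """Multiply by the i-th coordinate function."""
    out = {}
    for exps, c in p.items():
        e = list(exps)
        e[i] += 1
        out[tuple(e)] = c
    return out


def J(p, a, b):
    """Apply -i (x_a d/dx_b - x_b d/dx_a) to p, with zero-based indices."""
    inner = add(times_x(diff(p, b), a), times_x(diff(p, a), b), factor=-1)
    return {key: -1j * c for key, c in inner.items()}


def L3(p):
    return J(p, 0, 1)                      # J_3


def L_plus(p):
    return add(J(p, 1, 2), J(p, 2, 0), factor=1j)   # J_1 + i J_2


def laplacian(p):
    out = {}
    for i in range(3):
        out = add(out, diff(diff(p, i), i))
    return out


def check_lemma(l):
    """Check the three claims of the lemma for p = (x1 + i x2)**l."""
    p = power({(1, 0, 0): 1, (0, 1, 0): 1j}, l)
    return (laplacian(p) == {}
            and add(L3(p), p, factor=-l) == {}     # L3 p - l p == 0
            and L_plus(p) == {})


print(all(check_lemma(l) for l in range(0, 7)))
```

All coefficients that arise here are Gaussian integers of small size, so the floating-point complex arithmetic is exact and the comparisons with the empty dictionary are reliable for these degrees.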

Corollary 17.17 The space $V_l$ is irreducible under the action of SO(3).

Proof. By Proposition 17.9, if we apply $L^-$ repeatedly to the polynomial
$p$, we obtain a "chain" of eigenvectors of length $2l + 1$. These
eigenvectors span an irreducible invariant subspace of dimension $2l + 1$.
Since we have already established that $\dim V_l = 2l + 1$, the elements of
the chain must span $V_l$, which implies that $V_l$ is irreducible. ■

We  have  now  assembled  all  the  pieces  necessary  for  a  proof  of  the  main 
result  of  this  section. 

Proof of Theorem 17.12. We have already proved Points 1 and 2 of the
theorem in Corollaries 17.14 and 17.17, respectively. Now, each $V_l$ is an
irreducible representation of SO(3), and no two of the $V_l$'s can be
isomorphic, because they all have different dimensions. Thus, by Exercise 19
in Chap. 16, $V_l$ and $V_m$ must be orthogonal inside $L^2(S^2)$ for
$l \neq m$, which is Point 3.

Finally, by the Stone-Weierstrass theorem and the density results of
Theorem A.10, the restrictions to $S^2$ of polynomials on $\mathbb{R}^3$ form
a dense subspace of $L^2(S^2)$. But Corollary 17.15 shows that the space of
restrictions to $S^2$ of polynomials coincides with the space of restrictions
to $S^2$ of harmonic polynomials. Thus, the span of the $V_l$'s is dense in
$L^2(S^2)$, establishing Point 4. ■


17.7 Realizing the Representations Inside $L^2(\mathbb{R}^3)$

Recall that for homogeneous polynomials on $\mathbb{R}^3$, the restriction map
from $\mathbb{R}^3$ to $S^2$ is injective. Thus, we may think of the space
$V_l$ equally well as a space of functions on $S^2$ (as in the previous
section) or as a space of functions on $\mathbb{R}^3$. In this section, then,
we will let $V_l$ denote the space of harmonic polynomials on $\mathbb{R}^3$
that are homogeneous of degree $l$.


Definition 17.18 Suppose $l$ is a non-negative integer and $f$ is a
measurable function on $(0, \infty)$ such that

\[ \int_0^\infty |f(r)|^2\, r^{2l+2}\, dr < \infty. \tag{17.14} \]

Let $V_{l,f} \subset L^2(\mathbb{R}^3)$ denote the space of functions $\psi$
of the form

\[ \psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|), \tag{17.15} \]

where $p \in V_l$.

The condition on $f(r)$ is precisely what one needs to make
$\psi(\mathbf{x})$ a square-integrable function on $\mathbb{R}^3$ (compute
the $L^2$ norm in spherical coordinates).
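
The suggested computation in spherical coordinates can be sketched as follows. Here $dS$ denotes the (unnormalized) surface area measure on the unit sphere, which is an assumption about normalization made for this illustration, and the homogeneity of $p$ is used in the form $p(r\omega) = r^l p(\omega)$.

```latex
\|\psi\|^2
  = \int_{\mathbb{R}^3} |p(\mathbf{x})|^2\, |f(|\mathbf{x}|)|^2\, d^3x
  = \int_0^\infty \!\! \int_{S^2} |p(r\omega)|^2\, |f(r)|^2\, r^2\, dS(\omega)\, dr
\]
\[
  = \left( \int_{S^2} |p(\omega)|^2\, dS(\omega) \right)
    \int_0^\infty |f(r)|^2\, r^{2l+2}\, dr .
```

The first factor is finite for every polynomial, so for nonzero $p$ the norm is finite exactly when the radial integral is finite.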

Definition 17.18 is not the one that physicists typically use. In the physics literature, one sees functions of the form

$$\psi(\mathbf{x}) = Y_{lm}(\theta, \phi)\, g(r), \tag{17.16}$$

where $r$, $\theta$, and $\phi$ are the usual spherical coordinates. Here $Y_{lm}$ is the restriction to the sphere of a particular harmonic polynomial that is homogeneous of degree $l$, written in spherical coordinates. (Up to a normalization factor, the $Y_{lm}$'s are obtained by using the basis for $V_l$ in Theorem 17.4.) Thus, if we move along a ray from the origin in $\mathbb{R}^3$, only the value of $g(r)$ changes. By contrast, in (17.15), as we move along a ray, the $p(\mathbf{x})$ factor contributes a factor of $r^l$. We can write the physics expression in rectangular coordinates as

$$\psi(\mathbf{x}) = Y_{lm}\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) g(|\mathbf{x}|). \tag{17.17}$$


For computational purposes, the expression (17.15) is more convenient than (17.17); in fact, in the analysis of the hydrogen atom, physicists multiply by $r^l$ at some later point in the calculation, just so that the relevant differential equation will take on a simpler form.

Proposition 17.19 Every space of the form $V_{l,f} \subset L^2(\mathbb{R}^3)$ is invariant and irreducible under the action of SO(3). Conversely, every finite-dimensional, irreducible, SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$ is of the form $V_{l,f}$ for some non-negative integer $l$ and some $f$ satisfying (17.14).

Proof. Since the factor $f(|\mathbf{x}|)$ is invariant under rotations, the action of SO(3) only affects the function $p$. Thus, $V_{l,f}$ is isomorphic, as a representation of SO(3), to the space $V_l$, which is irreducible by Theorem 17.12.

For the other direction, the Lebesgue measure on $\mathbb{R}^3$ decomposes as a product of the surface area measure on $S^2$ with the measure $4\pi r^2\,dr$ on $(0, \infty)$. Thus, by a standard measure-theoretic result (Proposition 19.12), $L^2(\mathbb{R}^3)$ decomposes canonically as the Hilbert tensor product of $L^2(S^2)$ and $L^2((0, \infty))$, where a vector of the form $f \otimes g$ in the tensor product corresponds to the function $f(\theta, \phi)\,g(r)$ in $L^2(\mathbb{R}^3)$, as in (17.16). Since $L^2(S^2)$ decomposes (Theorem 17.12) as the sum of the spaces $V_l$, $l = 0, 1, 2, \ldots$, we can decompose $L^2(\mathbb{R}^3)$ as a sum of spaces of the form

$$V_{l,k} := V_l \otimes g_k,$$

where the $g_k$'s form an orthonormal basis for $L^2((0, \infty))$.

Now, let $V$ be any finite-dimensional, irreducible, SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$. Let $\pi_{l,k} : L^2(\mathbb{R}^3) \to V_{l,k}$ be the orthogonal projection operator, and let $p_{l,k}$ be the restriction of $\pi_{l,k}$ to $V$. This map is easily seen to be an intertwining map for the action of SO(3). Thus, since both $V$ and $V_{l,k}$ are irreducible, Schur's lemma tells us that each $p_{l,k}$ is either zero or an isomorphism. Furthermore, since the spaces $V_{l,k}$ are nonisomorphic for different values of $l$, we cannot have both $p_{l,k}$ and $p_{l',k'}$ being nonzero for $l \neq l'$. On the other hand, $p_{l,k}$ cannot be zero for all $k$ and $l$, since the $V_{l,k}$'s span $L^2(\mathbb{R}^3)$. Thus, there must be some value $l_0$ of $l$ such that $p_{l_0,k_0}$ is nonzero for some $k_0$ but such that $p_{l,k} = 0$ for all $l \neq l_0$.

Applying Schur's lemma again, we see that $p_{l_0,k}\,(p_{l_0,k_0})^{-1}$ must be of the form $c_k I$ for each $k$. Given any $\psi \in V$, let $v$ be the unique element of $V_{l_0}$ such that $p_{l_0,k_0}(\psi) = v \otimes g_{k_0}$. Then we have

$$p_{l_0,k}(\psi) = c_k\,(v \otimes g_k)$$

for every $k$. Since also $p_{l,k}(\psi) = 0$ for $l \neq l_0$, we conclude that $\psi$ must be of the form $v \otimes g$, where

$$g = \sum_k c_k\, g_k.$$

Since this holds for each $\psi \in V$ (with the same set of constants $c_k$), we see that $V = V_{l_0} \otimes g$, which is nothing but the form in (17.16). Then $V$ is of the form claimed in the proposition, where $f(r) = g(r)/r^{l_0}$. ■

It can further be shown that each closed, SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$ decomposes as an orthogonal direct sum of finite-dimensional, irreducible, SO(3)-invariant subspaces. This result is just a special case of a general result for strongly continuous unitary representations of compact topological groups. (See, e.g., Chap. 5 of [10].) Since we already know that $L^2(\mathbb{R}^3)$ is a direct sum of finite-dimensional, irreducible invariant subspaces, it is probably possible to give an elementary proof of this result, but we will not pursue that approach here.


17.8  Spin 




We classified irreducible finite-dimensional representations of the Lie algebra so(3) by their “spin” $l$, where $l$ is the largest eigenvalue for the operator $L_3 = i\pi(F_3)$. The possible values for $l$ are the non-negative integers $(0, 1, 2, \ldots)$ and the positive half-integers $(1/2, 3/2, \ldots)$. Inside $L^2(S^2)$ and $L^2(\mathbb{R}^3)$, however, we found only irreducible representations of so(3) with integer spin. It is easy to understand why the half-integer spin representations do not occur: They do not correspond to any representation of the group SO(3). Since $L^2(S^2)$ and $L^2(\mathbb{R}^3)$ both carry a natural unitary action $\Pi$ of the group SO(3), any finite-dimensional subspace that is invariant under the associated Lie algebra representation $\pi$ will also be invariant under $\Pi$ and thus constitute a representation of SO(3).

Although the half-integer representations $\pi_l$ of the Lie algebra so(3) cannot be exponentiated to representations of SO(3), they can be exponentiated to representations of the universal cover SU(2) of SO(3), as in the proof of Proposition 17.10. For a half-integer $l$, the associated representation $\Pi_l$ of SU(2) satisfies $\Pi_l(-I) = -I$, which means that $\Pi_l$ does not factor through $\mathrm{SO}(3) = \mathrm{SU}(2)/\{I, -I\}$. If, however, we think about projective representations, we see that $[-I]$ is the identity element in $\mathrm{PU}(V_l)$. Thus, even when $l$ is a half-integer, we get a well-defined projective representation $\Pi_l'$ of SO(3) that satisfies

$$\Pi_l'(e^X) = \left[e^{\pi_l(X)}\right]$$

for all $X \in \mathrm{so}(3)$, where $[U]$ denotes the image of $U \in \mathrm{U}(V_l)$ in $\mathrm{PU}(V_l)$.

It  is  generally  believed  that  the  physics  of  the  universe  is  invariant  under 
the  rotation  group  SO (3).  This  does  not  mean  that  one  never  considers 
models  without  rotational  symmetry,  because  the  local  environment  of, 
say,  a  hydrogen  atom  in  a  magnetic  field  breaks  the  rotational  symmetry  of 
the hydrogen atom. Nevertheless, if we were to rotate both the hydrogen
atom  and  the  magnetic  field,  the  physics  of  the  problem  would  not  change. 
In  quantum  mechanics,  rotational  symmetry  means  that  there  should  be 
a  projective  unitary  representation  of  SO (3)  on  the  Hilbert  space  of  the 
universe  that  commutes  with  the  Hamiltonian  operator.  Now,  the  Hilbert 
space  of  the  universe  (if  there  is  such  a  thing)  is  built  up  out  of  Hilbert 
spaces  for  each  type  of  particle.  Thus,  we  expect  that  the  Hilbert  space 
for  a  single  particle  will  also  carry  a  projective  unitary  representation  of 
SO(3). 

The simplest possibility for the Hilbert space of a single particle is the Hilbert space $L^2(\mathbb{R}^3)$, which certainly carries an (ordinary) unitary action of SO(3), as we have been discussing in this chapter. Based on various experimental observations, however, physicists have proposed a modification to the Hilbert space for an individual particle that incorporates “internal degrees of freedom.” The proposal is that for each type of particle, the quantum Hilbert space should be of the form $L^2(\mathbb{R}^3) \hat{\otimes} V$, where $V$ is a finite-dimensional Hilbert space that carries an irreducible projective unitary representation of SO(3). Here $\hat{\otimes}$ is the Hilbert tensor product (Appendix A.4.5). The (projective) action of SO(3) on $V$ describes the action of the rotation group on the internal degrees of freedom of the particle.

Now, according to Proposition 16.46, the space $V$ carries a (trace-zero) ordinary representation $\pi$ of the Lie algebra so(3). In customary physics terminology, the largest eigenvalue $l$ of the operator $L_3 := i\pi(F_3)$ in $V$ is then called the spin of the particle. We then denote the space $V$ by $V_l$ to indicate the value of the spin. Electrons, for example, are “spin 1/2” particles, meaning that the Hilbert space for a single electron is $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$, where $V_{1/2}$ is a two-dimensional projective representation of SO(3).

It is easy to see that the tensor product of two projective unitary representations of a given group is again a projective unitary representation of that group. (By contrast, the direct sum of two projective unitary representations is in general not again a projective unitary representation.) In the case at hand, we can think of $L^2(\mathbb{R}^3)$ as carrying a unitary representation $\Pi$ of SU(2) that factors through SO(3), that is, for which $\Pi(-I) = I$. Meanwhile, we can think of $V_l$ as carrying a unitary representation $\Pi_l$ of SU(2) in which $\Pi_l(-I) = \pm I$, with the plus sign if $l$ is an integer and the minus sign if $l$ is a half-integer. Thus, $L^2(\mathbb{R}^3) \hat{\otimes} V_l$ carries a unitary representation $\Pi \otimes \Pi_l$ of SU(2) in which $(\Pi \otimes \Pi_l)(-I) = \pm I$. Thus, in the projective sense, $\Pi \otimes \Pi_l$ factors through SO(3).

Summary 17.20 (Spin) Each type of particle has a “spin” $l$, which is a non-negative integer or half-integer. The Hilbert space for such a particle is $L^2(\mathbb{R}^3) \hat{\otimes} V_l$, where $V_l$ is an irreducible projective representation of SO(3) of dimension $2l + 1$.

Since $V_l$ is finite dimensional, the Hilbert tensor product $L^2(\mathbb{R}^3) \hat{\otimes} V_l$ coincides with the algebraic tensor product of $L^2(\mathbb{R}^3)$ with $V_l$.

Definition 17.21 A particle for which the spin is an integer is called a boson, and a particle for which the spin is a half-integer is called a fermion.

To  see  the  significance  of  the  distinction  between  integer  and  half-integer 
spin,  one  needs  to  look  at  the  structure  of  the  Hilbert  space  describing 
multiple  particles  of  a  given  type,  such  as  the  Hilbert  space  for  five  electrons. 
This  topic  is  discussed  in  Chap.  19. 


17.9  Tensor  Products  of  Representations: 

“Addition  of  Angular  Momentum” 

Let $V_l$ and $V_m$ be irreducible representations of so(3) with dimensions $2l + 1$ and $2m + 1$, respectively. As discussed in Sect. 16.8, the tensor product space $V_l \otimes V_m$ can be viewed as another representation of so(3). Unless one of $l$ and $m$ is zero, $V_l \otimes V_m$ is not irreducible. It is of interest, then, to decompose $V_l \otimes V_m$ as a direct sum of irreducible invariant subspaces. This decomposition, in the case that $V_l$ is an irreducible SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$ and $V_m$ is the space of internal degrees of freedom of a particle, will help us in decomposing the Hilbert space for a particle with spin into irreducible, SO(3)-invariant subspaces.

Proposition 17.22 Let $V_{1/2}$ be an irreducible representation of so(3) of dimension 2, and let $V_l$ be an irreducible representation of so(3) of dimension $2l + 1$, where $l$ is a non-negative integer or half-integer. If $l = 0$, $V_l \otimes V_{1/2}$ is irreducible. If $l > 0$, then we have

$$V_l \otimes V_{1/2} \cong V_{l+1/2} \oplus V_{l-1/2},$$

where “$\cong$” denotes an isomorphism of representations.

Proof. If $l = 0$, then it is easy to see that $V_l \otimes V_{1/2}$ is isomorphic to $V_{1/2}$, which is irreducible. Assume, then, that $l > 0$.

Let $L^+$, $L^-$, and $L_3$ be the operators in Theorem 17.4, constructed using the representation $\pi_l$, and let $\sigma^+$, $\sigma^-$, and $\sigma_3$ be the analogous operators constructed using the representation $\pi_{1/2}$. As in Sect. 16.8, we define operators $J^+$, $J^-$, and $J_3$ on $V_l \otimes V_{1/2}$ by

$$\begin{aligned}
J^+ &= L^+ \otimes I + I \otimes \sigma^+ \\
J^- &= L^- \otimes I + I \otimes \sigma^- \\
J_3 &= L_3 \otimes I + I \otimes \sigma_3.
\end{aligned} \tag{17.18}$$

Let $\{v_0, \ldots, v_{2l}\}$ be a basis for $V_l$ as in Theorem 17.4, and let $\{e_0, e_1\}$ be a similar basis for $V_{1/2}$. Then the vectors of the form $v_j \otimes e_k$ form a basis for $V_l \otimes V_{1/2}$. The eigenvalues of $J_3$ are the numbers of the form

$$(l - j) + \left(\tfrac{1}{2} - k\right),$$

$j = 0, 1, \ldots, 2l$, $k = 0, 1$. Thus, the eigenvalues of $J_3$ range from $l + 1/2$ to $-(l + 1/2)$. The numbers $l + 1/2$ and $-(l + 1/2)$ occur as eigenvalues only once. All other eigenvalues $\lambda$ occur twice, once as $(\lambda - 1/2) + 1/2$ and once as $(\lambda + 1/2) - 1/2$.

The vector $v_0 \otimes e_0$ is an eigenvector for $J_3$ with the largest possible eigenvalue $l + 1/2$, so that $J^+(v_0 \otimes e_0) = 0$. According to Proposition 17.9, if we apply $J^-$ repeatedly, we will obtain a “chain” of eigenvectors of length $2l + 2$, and the span of these vectors forms an irreducible invariant subspace $W_0$ isomorphic to $V_{l+1/2}$.

Now, by Proposition 17.7, there exist inner products on $V_l$ and $V_{1/2}$ that make $\pi_l$ and $\pi_{1/2}$ “unitary,” meaning that $\pi(X)^* = -\pi(X)$ for all $X \in \mathrm{so}(3)$. If we use on $V_l \otimes V_{1/2}$ the natural inner product, obtained from the inner products on $V_l$ and $V_{1/2}$ as in Appendix A.4.5, then $\pi_l \otimes \pi_{1/2}$ is also unitary. Thus, the orthogonal complement $W_0^\perp$ of the invariant subspace $W_0$ is also invariant. Since all eigenvalues for $J_3$ except the largest and smallest have multiplicity 2, we see that the largest eigenvalue for $J_3$ in $W_0^\perp$ is $l - 1/2$. Let $w_0 \in W_0^\perp$ be an eigenvector for $J_3$ with eigenvalue $l - 1/2$. If we repeatedly apply the lowering operator $J^- = L^- \otimes I + I \otimes \sigma^-$ to $w_0$, we will obtain a chain of eigenvectors of length $2l$. These eigenvectors span an irreducible invariant subspace $W_1$ of $V_l \otimes V_{1/2}$ of dimension $2l$. Since

$$\dim W_0 + \dim W_1 = 4l + 2 = \dim(V_l \otimes V_{1/2}),$$

we must have $W_1 = W_0^\perp$, completing the proof. ■
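To make the eigenvalue counting in the proof concrete, here is the smallest non-trivial integer case, $l = 1$, written out as a worked example (an illustration added for orientation, using only the counting argument above).

```latex
\begin{align*}
\text{Eigenvalues of } L_3 \text{ on } V_1 &: \quad 1,\ 0,\ -1, \\
\text{Eigenvalues of } \sigma_3 \text{ on } V_{1/2} &: \quad \tfrac12,\ -\tfrac12, \\
\text{Eigenvalues of } J_3 \text{ on } V_1 \otimes V_{1/2} &: \quad
  \tfrac32,\ \tfrac12,\ \tfrac12,\ -\tfrac12,\ -\tfrac12,\ -\tfrac32 .
\end{align*}
% Removing one chain 3/2, 1/2, -1/2, -3/2 (a copy of V_{3/2}) leaves
% 1/2, -1/2, which are exactly the eigenvalues of J_3 on V_{1/2}.
\[
V_1 \otimes V_{1/2} \cong V_{3/2} \oplus V_{1/2},
\qquad 4 + 2 = 6 = 3 \cdot 2 .
\]
```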

Since an electron is a “spin 1/2” particle, the Hilbert space for a single electron is, according to Sect. 17.8, $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$, where $V_{1/2}$ is an irreducible projective unitary representation of SO(3) of dimension 2. Meanwhile, in Sect. 17.7, we saw how to find irreducible, SO(3)-invariant subspaces $V_{l,f}$ of $L^2(\mathbb{R}^3)$ of dimension $2l + 1$, for $l = 0, 1, 2, \ldots$, where $f$ is an arbitrary radial function. By applying Proposition 17.22 to the case $V_l = V_{l,f}$, we obtain irreducible SO(3)-invariant subspaces of the Hilbert space $L^2(\mathbb{R}^3) \hat{\otimes} V_{1/2}$. Finding such subspaces is essential in, for example, analyzing the fine structure of the hydrogen atom.

In the case that $V_l$ is an SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$, the formula for, say, the operator $J_3$ in (17.18) is written in the physics literature as

$$J_3 = L_3 + \sigma_3, \tag{17.19}$$

where it is understood that $L_3$ acts on the first factor in the tensor product and $\sigma_3$ acts on the second factor. (That is to say, the tensor product with the identity operator is understood and thus not written.) Here $L_3$ is the ordinary angular momentum operator and $\sigma_3$ describes the action of the basis element $F_3 \in \mathrm{so}(3)$ on the space $V_{1/2}$. Formulas such as (17.19) account for the physics terminology “addition of angular momentum” to describe the analysis of tensor products of representations of so(3). In this context, the operator $L_3$ $(= L_3 \otimes I)$ is called an orbital angular momentum operator, and the operator $\sigma_3$ $(= I \otimes \sigma_3)$ is called a spin angular momentum operator, and similarly for $L^\pm$ and $\sigma^\pm$.

We now record the general result for tensor products of irreducible representations of so(3).

Proposition 17.23 For any $j = 0, 1/2, 1, \ldots$, let $V_j$ denote the unique irreducible representation of so(3) of dimension $2j + 1$. Then for any $l$ and $m$ with $l \geq m$, we have

$$V_l \otimes V_m \cong V_{l+m} \oplus V_{l+m-1} \oplus \cdots \oplus V_{l-m+1} \oplus V_{l-m}. \tag{17.20}$$

The proof of this result is similar to that of Proposition 17.22, and is omitted; see Theorem D.1 in Appendix D of [21]. An important property of this decomposition is that each irreducible representation that occurs on the right-hand side of (17.20) occurs only once. This property of the representations of so(3) is the key idea in the proof of the Wigner-Eckart theorem. See Appendix D of [21] for details.
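The decomposition (17.20) can be checked at the level of eigenvalues of $J_3$ with a short script. The sketch below is not a proof and is not taken from the text: it only lists the eigenvalues of $J_3$ on $V_l \otimes V_m$ (sums of eigenvalues on the two factors) and repeatedly strips off the eigenvalues of the irreducible piece with the largest remaining eigenvalue. The function names `weights` and `decompose` are hypothetical, chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction


def weights(j):
    """Eigenvalues j, j-1, ..., -j of the third angular momentum operator on V_j."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]


def decompose(l, m):
    """Spins of the irreducible summands of V_l (x) V_m, found by eigenvalue counting.

    Assumes l and m are non-negative integers or half-integers (as Fractions).
    """
    # Eigenvalues of J3 on the tensor product, with multiplicity.
    remaining = Counter(a + b for a in weights(l) for b in weights(m))
    spins = []
    while remaining:
        top = max(remaining)          # largest eigenvalue still unaccounted for
        spins.append(top)
        for w in weights(top):        # remove one full chain top, top-1, ..., -top
            remaining[w] -= 1
            if remaining[w] == 0:
                del remaining[w]
    return spins


if __name__ == "__main__":
    print(decompose(Fraction(3, 2), 1))
    print(decompose(1, Fraction(1, 2)))
```

For example, `decompose(Fraction(3, 2), 1)` returns the spins 5/2, 3/2, 1/2, each exactly once, in agreement with (17.20) for $l = 3/2$, $m = 1$.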


17.10  Vectors  and  Vector  Operators 

Definition 17.24 A function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ is said to transform like a vector if

$$\mathbf{c}(R\mathbf{x}, R\mathbf{p}) = R(\mathbf{c}(\mathbf{x}, \mathbf{p})) \tag{17.21}$$

for all $R \in \mathrm{SO}(3)$.

In the physics literature, the expression “is a vector” is sometimes used in place of “transforms like a vector.”

Note that in Definition 17.24, we only consider the transformation property of $\mathbf{c}$ under elements of SO(3) rather than under a general element of O(3). If $\mathbf{c}$ transforms like a vector, one says that $\mathbf{c}$ is a “true vector” if $\mathbf{c}$ satisfies (17.21) for all $R$ in O(3) [not just in SO(3)] and one says that $\mathbf{c}$ is a “pseudovector” if $\mathbf{c}$ satisfies $\mathbf{c}(R\mathbf{x}, R\mathbf{p}) = -R(\mathbf{c}(\mathbf{x}, \mathbf{p}))$ for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$. For our purposes, it is not necessary to distinguish between true vectors and pseudovectors.

The position function $\mathbf{c}_1(\mathbf{x}, \mathbf{p}) := \mathbf{x}$, the momentum function $\mathbf{c}_2(\mathbf{x}, \mathbf{p}) := \mathbf{p}$, and the angular momentum function $\mathbf{c}_3(\mathbf{x}, \mathbf{p}) := \mathbf{x} \times \mathbf{p}$ are simple examples of functions that transform like vectors. (Transformation under rotations is one of the standard properties of the cross product.) A typical example of a function transforming like a vector is $\mathbf{c}(\mathbf{x}, \mathbf{p}) = (\mathbf{x} \cdot \mathbf{p})\,|\mathbf{x}|\,(\mathbf{x} \times \mathbf{p})$.
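The condition (17.21) is easy to test numerically for a given function. The sketch below checks it for the example $(\mathbf{x} \cdot \mathbf{p})\,|\mathbf{x}|\,(\mathbf{x} \times \mathbf{p})$ at one pair of points and one rotation, so it is a sanity check rather than a proof. The helper names and the particular rotation (about the $x_3$-axis, then about the $x_1$-axis) are choices made for this illustration only.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def rotate(v, alpha, beta):
    """Rotate v by angle alpha about the x3-axis, then by angle beta about the x1-axis."""
    c, s = math.cos(alpha), math.sin(alpha)
    x, y, z = c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]
    c, s = math.cos(beta), math.sin(beta)
    return (x, c * y - s * z, s * y + c * z)


def c_example(x, p):
    """The function c(x, p) = (x . p) |x| (x cross p) from the text."""
    scale = dot(x, p) * math.sqrt(dot(x, x))
    return tuple(scale * t for t in cross(x, p))


def max_defect(c, x, p, alpha, beta):
    """Largest component of c(Rx, Rp) - R c(x, p) for the rotation R above."""
    lhs = c(rotate(x, alpha, beta), rotate(p, alpha, beta))
    rhs = rotate(c(x, p), alpha, beta)
    return max(abs(a - b) for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    x, p = (1.0, 2.0, -0.5), (0.3, -1.2, 2.0)
    print(max_defect(c_example, x, p, 0.7, -1.1))                       # rounding error only
    print(max_defect(lambda x, p: (x[0], 0.0, 0.0), x, p, 0.7, -1.1))   # not a vector
```

The second call uses a function that picks out a fixed coordinate, which does not transform like a vector, so its defect is far from zero.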

Proposition 17.25 Let $\mathbf{j}(\mathbf{x}, \mathbf{p}) = \mathbf{x} \times \mathbf{p}$ denote the angular momentum function on $\mathbb{R}^3 \times \mathbb{R}^3$. Suppose a smooth function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ transforms like a vector. Then we have

$$\{c_k, j_k\} = 0 \tag{17.22}$$

for $k = 1, 2, 3$. Furthermore, we have

$$\{c_1, j_2\} = \{j_1, c_2\} = c_3 \tag{17.23}$$

and other relations obtained from (17.23) by cyclically permuting the indices.

Proof. Let $R(\theta)$ denote a counterclockwise rotation by angle $\theta$ in the $(x_1, x_2)$-plane. Applying (17.21) with $R = R(\theta)$ and looking only at the first component of the vectors, we have

$$c_1(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_1(\mathbf{x}, \mathbf{p}) \cos\theta - c_2(\mathbf{x}, \mathbf{p}) \sin\theta. \tag{17.24}$$

Now, as in the proof of Proposition 2.30, the Poisson bracket $\{c_1, j_3\}$ is precisely the derivative of the left-hand side of (17.24) with respect to $\theta$, evaluated at $\theta = 0$. Thus,

$$\{c_1, j_3\} = -c_2$$

and so $\{j_3, c_1\} = c_2$, which is one of the relations obtained from (17.23) by cyclically permuting the indices.

Meanwhile, if we again apply (17.21) with $R = R(\theta)$ but look now at the third component of the vectors, we have that

$$c_3(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_3(\mathbf{x}, \mathbf{p}).$$

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $\{c_3, j_3\} = 0$. All other brackets are computed similarly. ■

We  now  turn  to  the  quantum  counterpart  of  a  function  that  transforms 
like  a  vector. 

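Definition 17.26 For any ordered triple $\mathbf{C} := (C_1, C_2, C_3)$ of operators on $L^2(\mathbb{R}^3)$ and any vector $\mathbf{v} \in \mathbb{R}^3$, let $\mathbf{v} \cdot \mathbf{C}$ be the operator

$$\mathbf{v} \cdot \mathbf{C} = \sum_{j=1}^{3} v_j C_j. \tag{17.25}$$

Then an ordered triple $\mathbf{C}$ of operators on $L^2(\mathbb{R}^3)$ is called a vector operator if

$$(R\mathbf{v}) \cdot \mathbf{C} = \Pi(R)(\mathbf{v} \cdot \mathbf{C})\Pi(R)^{-1} \tag{17.26}$$

for all $R \in \mathrm{SO}(3)$.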

Here $\Pi(\cdot)$ is the natural unitary action of SO(3) on $L^2(\mathbb{R}^3)$ in Definition 17.1. Let us try to understand what this definition is saying in the case of, say, the angular momentum, which is (as we shall see) a vector operator. The operators $\hat{J}_1$, $\hat{J}_2$, and $\hat{J}_3$ represent the components of $\hat{\mathbf{J}}$ in the directions of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$, respectively. More generally, we can consider the component of $\hat{\mathbf{J}}$ in the direction of any unit vector $\mathbf{v}$, which will be nothing but $\mathbf{v} \cdot \hat{\mathbf{J}}$, as defined in (17.25). Since there is no preferred direction in space, we expect that for any two unit vectors $\mathbf{v}_1$ and $\mathbf{v}_2$, the operators $\mathbf{v}_1 \cdot \hat{\mathbf{J}}$ and $\mathbf{v}_2 \cdot \hat{\mathbf{J}}$ should be “the same operator, up to rotation.” Specifically, if $R$ is some rotation with $R\mathbf{v}_1 = \mathbf{v}_2$, then $\mathbf{v}_1 \cdot \hat{\mathbf{J}}$ and $\mathbf{v}_2 \cdot \hat{\mathbf{J}}$ should differ only by the action of $R$ on the Hilbert space $L^2(\mathbb{R}^3)$. But this is precisely what (17.26) says, with $\mathbf{v} = \mathbf{v}_1$ and $\mathbf{C} = \hat{\mathbf{J}}$:

$$\mathbf{v}_2 \cdot \hat{\mathbf{J}} = \Pi(R)(\mathbf{v}_1 \cdot \hat{\mathbf{J}})\Pi(R)^{-1}.$$

We will not concern ourselves with the question of whether (17.26) continues to hold for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$. The position and momentum operators $\mathbf{X}$ and $\mathbf{P}$ are easily seen to be vector operators. As in the classical case, the cross product of two vector operators is again a vector operator. (See Exercise 7 in Chap. 18.) In particular, the angular momentum $\hat{\mathbf{J}} = \mathbf{X} \times \mathbf{P}$ is a vector operator.

If the operators $C_1$, $C_2$, and $C_3$ are unbounded, we should say something in Definition 17.26 about the domains of the operators in question. The simplest approach is to find some dense subspace $V$ of $L^2(\mathbb{R}^3)$ that is contained in the domain of each $C_j$ and such that $V$ is invariant under rotations. In that case, the equality in (17.26) is understood to hold when applied to a vector in $V$. In many cases, we can take $V$ to be the Schwartz space $\mathcal{S}(\mathbb{R}^3)$. In the following proposition, the space $V$ should satisfy certain technical domain conditions that permit differentiation of (17.29) when applied to a vector $\psi$ in $V$. We will not pursue the details of such conditions here.


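Proposition 17.27 If $\mathbf{C}$ is a vector operator, then the components of $\mathbf{C}$ satisfy

$$\frac{1}{i\hbar}[C_j, \hat{J}_j] = 0 \tag{17.27}$$

for $j = 1, 2, 3$. Furthermore, we have

$$\frac{1}{i\hbar}[C_1, \hat{J}_2] = \frac{1}{i\hbar}[\hat{J}_1, C_2] = C_3 \tag{17.28}$$

and other relations obtained from (17.28) by cyclically permuting the indices.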

Proof. As in the proof of Proposition 17.25, let $R(\theta)$ denote a rotation in the $(x_1, x_2)$-plane, and let $\mathbf{e}_1 = (1, 0, 0)$. Applying (17.26) with $R = R(\theta)$ and $\mathbf{v} = \mathbf{e}_1$, we have

$$\Pi(R(\theta))\,C_1\,\Pi(R(\theta))^{-1} = C_1 \cos\theta + C_2 \sin\theta. \tag{17.29}$$

But $R(\theta) = e^{\theta F_3}$, where $\{F_j\}$ is the basis for so(3) described in Sect. 16.5. Thus, differentiating (17.29) with respect to $\theta$ at $\theta = 0$ gives

$$\pi(F_3)C_1 - C_1\pi(F_3) = C_2.$$

Since $\hat{J}_3 = i\hbar\,\pi(F_3)$ (Proposition 17.3), we obtain $(1/(i\hbar))[\hat{J}_3, C_1] = C_2$, which is one of the relations obtained from (17.28) by cyclically permuting the indices.

Meanwhile, applying (17.26) with $R = R(\theta)$ and $\mathbf{v} = \mathbf{e}_3$ gives

$$\Pi(R(\theta))\,C_3\,\Pi(R(\theta))^{-1} = C_3.$$

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $[\pi(F_3), C_3] = 0$. All other relations are obtained similarly. ■

For more information about vector operators, including the Wigner-Eckart theorem, see Appendix D of [21]. See also Exercise 7.

17.11 Exercises


1. Verify the expression (17.2) for the vector field $x_1\,\partial/\partial x_2 - x_2\,\partial/\partial x_1$.

2. Verify the relation (17.12) in the proof of Theorem 17.4, using induction on $j$ and the commutation relation (17.10).

3. This exercise provides a proof of Proposition 17.8. Let $(\pi, V_l)$ denote an irreducible representation of so(3) of dimension $2l + 1$ and let $C_\pi$ denote the Casimir operator as defined in the proposition.

(a) Show that $[\pi(F_j), C_\pi] = 0$ for all $j = 1, 2, 3$.

(b) Using Schur's lemma, show that there is some $\lambda \in \mathbb{C}$ such that $C_\pi v = \lambda v$ for all $v \in V_l$.

(c) Show that

$$C_\pi = -\left(L^- L^+ + L_3^2 + L_3\right),$$

where $L^+$, $L^-$, and $L_3$ are as in Theorem 17.4.

(d) By computing $C_\pi$ on some suitably chosen vector in $V_l$, show that the constant $\lambda$ in Part (b) has the value $-l(l + 1)$.


4. Let $l$ be any non-negative integer or half-integer. Construct a vector space $V$ by decreeing that vectors $\{v_0, v_1, \ldots, v_{2l}\}$ form a basis for $V$. Define operators $L^+$, $L^-$, and $L_3$ on $V$ by the expressions in (17.6). Show that these operators satisfy the commutation relations (17.8), (17.9), and (17.10).

Hint: In the case of $L^-$, treat the vector $v_{2l}$ separately from the other basis vectors. In the case of $L^+$, treat the vector $v_0$ separately from the other basis vectors.


5. Let $(\pi, V)$ be an irreducible representation of so(3) of dimension 2, with basis $\{v_0, v_1\}$ as in (17.6). Consider $V \oplus V$ as a representation of so(3) as in Sect. 16.8. Let $v = (v_0, v_1)$. Show that the smallest invariant subspace of $V \oplus V$ containing $v$ is $V \oplus V$.

Note: This shows that $V \oplus V$ has a cyclic vector, even though $V \oplus V$ is not irreducible.


6. Compute explicit bases for the two irreducible invariant subspaces $W_0 \cong V_{3/2}$ and $W_0^\perp \cong V_{1/2}$ of $V_1 \otimes V_{1/2}$. Each basis element for $W_0$ or $W_0^\perp$ should be expressed as a linear combination of the elements $v_j \otimes e_k$ in the proof of Proposition 17.22.

7. Let $V_l$, $V_m$, and $V_n$ be irreducible representations of so(3) of dimension $2l + 1$, $2m + 1$, and $2n + 1$, respectively. Suppose that $\Phi$ and $\Psi$ are nonzero intertwining maps of $V_l$ into $V_m \otimes V_n$. Show that $\Phi = c\Psi$ for some $c \in \mathbb{C}$.

Hint: Use Proposition 17.23 and Schur's lemma.

Note: This result is closely related to the Wigner-Eckart theorem for “irreducible tensor operators.”


18 

Radial  Potentials  and  the  Hydrogen 
Atom 


18.1  Radial  Potentials 

If $V$ is any radial function on $\mathbb{R}^3$, let $\hat{H} = -(\hbar^2/(2m))\Delta + V$ be the corresponding Hamiltonian operator, acting on $L^2(\mathbb{R}^3)$. We will look for solutions to the time-independent Schrödinger equation $\hat{H}\psi = E\psi$ of the form $\psi(\mathbf{x}) = p(\mathbf{x})\,f(|\mathbf{x}|)$, where $f$ is a smooth function on $(0, \infty)$ and $p$ is a harmonic polynomial on $\mathbb{R}^3$ that is homogeneous of degree $l$.

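Proposition 18.1 Let $p$ be a harmonic polynomial on $\mathbb{R}^3$ that is homogeneous of degree $l$ and let $f$ be a smooth function on $(0, \infty)$. Let $\psi$ be the function on $\mathbb{R}^3 \setminus \{0\}$ given by

$$\psi(\mathbf{x}) = p(\mathbf{x})\,f(|\mathbf{x}|).$$

Then on $\mathbb{R}^3 \setminus \{0\}$ we have

$$\Delta\psi(\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2 f}{dr^2} + \frac{2(l + 1)}{r} \frac{df}{dr} \right], \tag{18.1}$$

where the derivatives of $f$ are evaluated at $r = |\mathbf{x}|$.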


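Proof. We begin with the case $l = 0$, so that $p$ is a constant (which we take to be 1) and $\psi$ is just the radial function $f(|\mathbf{x}|)$. Then

$$\frac{\partial}{\partial x_j} f(|\mathbf{x}|) = \frac{df}{dr} \frac{\partial}{\partial x_j} \sqrt{x_1^2 + x_2^2 + x_3^2} = \frac{df}{dr} \frac{x_j}{|\mathbf{x}|}$$

and so

$$\Delta f(|\mathbf{x}|) = \sum_{j=1}^{3} \left[ \frac{d^2 f}{dr^2} \frac{x_j^2}{|\mathbf{x}|^2} + \frac{df}{dr} \left( \frac{1}{|\mathbf{x}|} - \frac{x_j^2}{|\mathbf{x}|^3} \right) \right] = \frac{d^2 f}{dr^2} + \frac{2}{r} \frac{df}{dr}.$$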

For the general case, the product rule for the Laplacian gives

$$\Delta\psi = (\Delta p)\,f(|\mathbf{x}|) + 2\nabla p \cdot \nabla f(|\mathbf{x}|) + p\,\Delta f(|\mathbf{x}|).$$

Now, $\Delta p = 0$ by assumption. Furthermore, since $f(|\mathbf{x}|)$ is radial, its gradient points in the radial direction. Thus, only the radial component of $\nabla p$ is relevant. Moreover, on each ray through the origin, $p$ behaves like a constant times $r^l$. Thus, the $r$-derivative of $p$ is $(l/r)p$, giving

$$\Delta\psi = \frac{2l}{r}\,p\,\frac{df}{dr} + p\,\frac{d^2 f}{dr^2} + \frac{2}{r}\,p\,\frac{df}{dr},$$

which simplifies to the desired expression. ■
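Formula (18.1) can be sanity-checked numerically. The sketch below compares a central-difference approximation of the Laplacian with the right-hand side of (18.1) at a single point; the particular choices $p(\mathbf{x}) = x_1 x_2$ (harmonic and homogeneous of degree 2), $f(r) = e^{-r}$, the sample point, and the step size are assumptions made only for this illustration.

```python
import math

L_DEGREE = 2  # degree l of the harmonic polynomial chosen below


def p(x):
    """p(x) = x1 * x2, a harmonic polynomial homogeneous of degree 2."""
    return x[0] * x[1]


def f(r):
    return math.exp(-r)


def df(r):
    return -math.exp(-r)


def d2f(r):
    return math.exp(-r)


def norm(x):
    return math.sqrt(sum(t * t for t in x))


def psi(x):
    """psi(x) = p(x) f(|x|), the form considered in Proposition 18.1."""
    return p(x) * f(norm(x))


def laplacian(func, x, h=1e-3):
    """Second-order central-difference approximation to the Laplacian at x."""
    total = 0.0
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        total += (func(xp) + func(xm) - 2.0 * func(x)) / h ** 2
    return total


def rhs_18_1(x):
    """Right-hand side of (18.1): p(x) [f'' + 2(l+1)/r f'] evaluated at r = |x|."""
    r = norm(x)
    return p(x) * (d2f(r) + 2 * (L_DEGREE + 1) / r * df(r))


x0 = [0.8, -0.5, 1.1]
numeric = laplacian(psi, x0)
exact = rhs_18_1(x0)

if __name__ == "__main__":
    print(numeric, exact)
```

The two values should agree up to the discretization error of the finite-difference scheme.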

Although the decomposition of functions in Definition 17.18 is for many purposes the most convenient one, it is not quite the customary way of turning spherical harmonics into functions on $\mathbb{R}^3$. Conventionally, one works in polar coordinates and considers functions of the form

$$\psi(r, \theta, \phi) = p(\theta, \phi)\,g(r),$$

where $p$ is the restriction to $S^2$ of an element of $V_l$. We can express this decomposition in rectangular coordinates as

$$\psi(\mathbf{x}) = p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) g(|\mathbf{x}|).$$

We can then obtain a more customary form of Proposition 18.1 as follows.


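Proposition 18.2 Suppose $p \in V_l$ and $g$ is a smooth function on $(0, \infty)$, and let $\psi$ be the function on $\mathbb{R}^3 \setminus \{0\}$ given by

$$\psi(\mathbf{x}) = p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) g(|\mathbf{x}|).$$

Then

$$(\Delta\psi)(r\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2 g}{dr^2} + \frac{2}{r} \frac{dg}{dr} - \frac{l(l + 1)}{r^2}\,g \right] \tag{18.2}$$

for all $\mathbf{x} \in S^2$ and $r \in (0, \infty)$.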
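
Proof. Since $p$ is homogeneous of degree $l$,

$$p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) = \frac{p(\mathbf{x})}{|\mathbf{x}|^l}.$$

Thus,

$$\psi(\mathbf{x}) = p(\mathbf{x})\,\frac{g(|\mathbf{x}|)}{|\mathbf{x}|^l}.$$

Applying Proposition 18.1 gives

$$\Delta\psi(\mathbf{x}) = p(\mathbf{x}) \left[ \frac{d^2}{dr^2}\!\left(\frac{g}{r^l}\right) + \frac{2(l + 1)}{r} \frac{d}{dr}\!\left(\frac{g}{r^l}\right) \right].$$

From here it is a straightforward but unilluminating calculation to verify the formula in the proposition. ■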
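For readers who want the omitted calculation, here is one way it can be carried out. This is a sketch under the notation of Proposition 18.1 applied to the radial function $g(r)/r^l$, with primes denoting $r$-derivatives.

```latex
\begin{align*}
\frac{d}{dr}\!\left(\frac{g}{r^l}\right)
  &= \frac{g'}{r^l} - \frac{l\,g}{r^{l+1}}, \\
\frac{d^2}{dr^2}\!\left(\frac{g}{r^l}\right)
  &= \frac{g''}{r^l} - \frac{2l\,g'}{r^{l+1}} + \frac{l(l+1)\,g}{r^{l+2}}, \\
\frac{d^2}{dr^2}\!\left(\frac{g}{r^l}\right)
  + \frac{2(l+1)}{r}\,\frac{d}{dr}\!\left(\frac{g}{r^l}\right)
  &= \frac{1}{r^l}\left[\, g'' + \frac{2}{r}\,g' - \frac{l(l+1)}{r^2}\,g \right].
\end{align*}
```

Evaluating at the point $r\mathbf{x}$ with $\mathbf{x} \in S^2$, the prefactor is $p(r\mathbf{x}) = r^l p(\mathbf{x})$, which cancels the factor $1/r^l$ and leaves a bracket of the same form as in (18.2).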

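Still another way to write functions on $\mathbb{R}^3$ is in the form

$$\psi(\mathbf{x}) = p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) \frac{h(|\mathbf{x}|)}{|\mathbf{x}|}, \tag{18.3}$$

so that $h(r) = r\,g(r)$. If we replace $g(r)$ by $h(r)/r$ in (18.2), we obtain, after a short calculation,

$$(\Delta\psi)(r\mathbf{x}) = \frac{1}{r}\,p(\mathbf{x}) \left[ \frac{d^2 h}{dr^2} - \frac{l(l + 1)}{r^2}\,h(r) \right]. \tag{18.4}$$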

Writing wave functions in the form (18.3) is convenient because we then have, for any radial potential,

$$-\frac{\hbar^2}{2m}\Delta\psi + V(|\mathbf{x}|)\,\psi = \frac{1}{r}\,p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) \left[ -\frac{\hbar^2}{2m} \frac{d^2 h}{dr^2} + V_{\mathrm{eff}}(r)\,h(r) \right], \qquad r = |\mathbf{x}|, \tag{18.5}$$

where $V_{\mathrm{eff}}$ is the effective potential given by

$$V_{\mathrm{eff}}(r) = V(r) + \frac{\hbar^2\,l(l + 1)}{2m r^2}. \tag{18.6}$$

Note that the quantity in square brackets in (18.5) is just an ordinary one-dimensional Schrödinger operator, since the first derivative term in (18.2) has been eliminated. Despite the naturalness of the form (18.3), it is the form (18.1) that is ultimately most convenient for finding the bound states of the hydrogen atom Hamiltonian.

Now, as the discussion following Proposition 9.34 illustrates, even if $\psi$ is square-integrable over $\mathbb{R}^3 \setminus \{0\}$ and $\Delta\psi$ is square-integrable over $\mathbb{R}^3 \setminus \{0\}$, $\psi$ may not be in the domain of the Laplacian, since the distributional Laplacian of $\psi$ may contain a term that is supported at the origin. In the case of the hydrogen atom, however, we will consider functions $\psi$ of the form (18.1) where $f$ and $df/dr$ are bounded near the origin and have exponential decay near infinity. Proposition 9.35 then tells us that $\psi$ is in the domain of $\Delta$.

18.2 The Hydrogen Atom: Preliminaries

A  hydrogen  atom  is  formed  out  of  a  single  electron  that  is  “bound”  to  a 
proton  by  means  of  the  electromagnetic  attraction  between  the  oppositely 
charged  particles.  The  study  of  the  hydrogen  atom  is  a  very  important  test 
case  in  quantum  mechanics,  and  the  ability  of  the  Schrodinger  equation  to 
explain  the  observed  energy  levels  of  hydrogen  was  a  crucial  early  success 
of  the  theory. 

A  proton  is  approximately  1,800  times  as  massive  as  an  electron.  Thus, 
to  first  approximation,  we  may  think  of  the  location  of  the  proton  as  being 
fixed,  with  the  electron  “orbiting”  around  this  location.  A  more  careful 
analysis  considers  both  the  proton  and  the  electron  as  orbiting  around 
their center of mass. The Hamiltonian for the relative position of the two
particles is precisely that of a particle orbiting around a fixed center, except
that the mass of the electron is replaced by the reduced mass $\mu$ of the
electron-proton system. (See Exercise 1.) Here, as in Proposition 2.16 in
the classical case,

\[
\mu = \frac{m_e m_p}{m_e + m_p},
\]

where $m_e$ and $m_p$ are the masses of the electron and proton, respectively.
Since $m_p \gg m_e$, the reduced mass is nearly the same as the mass of the
electron.
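
To get a feeling for how close the reduced mass is to the electron mass, here is a minimal Python sketch. The mass ratio of about 1836.15 is an assumed illustrative value (the text only quotes "approximately 1,800"), and the function name is hypothetical.

```python
def reduced_mass(m_e: float, m_p: float) -> float:
    """Reduced mass m_e * m_p / (m_e + m_p) of a two-body system."""
    return m_e * m_p / (m_e + m_p)


# Assumed illustrative value for the proton-to-electron mass ratio.
MASS_RATIO = 1836.15

# Work in units where the electron mass is 1.
mu_over_me = reduced_mass(1.0, MASS_RATIO)

# Relative shift of the reduced mass below the electron mass,
# which equals m_e / (m_e + m_p).
relative_shift = 1.0 - mu_over_me
```

With this ratio the reduced mass differs from the electron mass by roughly five parts in ten thousand, so replacing the electron mass by the reduced mass shifts every energy level by the same small relative amount.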

After  separating  out  the  motion  of  the  center  of  mass,  we  are  left  with 
the  following  Hamiltonian  for  the  relative  position  of  the  electron: 

\[
\hat{H} = -\frac{\hbar^2}{2\mu}\Delta - \frac{Q^2}{|x|}, \tag{18.7}
\]

where $Q$ is the charge of the electron. (We use a system of units, such
as “electrostatic” or “Gaussian” units, in which the Coulomb constant is
equal to 1.) It follows from Theorem 9.38 that $\hat{H}$ is self-adjoint on
$\mathrm{Dom}(\Delta)$ and that $\hat{H}$ is bounded below.

Note that the classical Hamiltonian $H(x, p)$ for a hydrogen atom is not
bounded below. After all, we can simply take $p = 0$ and take $x$ very
close to the origin. This unboundedness would cause strange behavior for
a hypothetical classical hydrogen atom. Recall that modeling a hydrogen
atom using the $1/r$ potential is only an approximation: we are using an
electrostatic formula for the force, which is correct when the positions of
the particles are held fixed, in a dynamical situation. A more realistic model
of hydrogen takes into account radiation, that is, the interaction of the
charged electron with the electromagnetic fields. Classically, a negatively
charged particle orbiting a positively charged nucleus would radiate, thus
giving up energy to the electromagnetic fields. The classical particle would
spiral rapidly toward the origin, with the particle’s energy going to $-\infty$
and the energy of the electromagnetic field going to $+\infty$. Thus, if hydrogen
were made up of classical charged particles, the electron would go into a “death
spiral” and emit a giant burst of electromagnetic radiation.

Fortunately  for  us,  this  is  not  how  real  particles  behave!  In  actuality,  the 
electron  is  a  quantum  particle.  A  quantum  electron  “orbiting”  a  proton  can 
still  give  up  energy  to  the  electromagnetic  field.  The  Hamiltonian  for  the 
quantum  hydrogen  atom,  however,  is  bounded  below,  as  a  consequence  of 
Theorem  9.38.  Thus,  the  electron  can  only  drop  to  its  ground  state  (the 
state  of  lowest  energy),  at  which  point  it  becomes  stable. 


18.3  The  Bound  States  of  the  Hydrogen  Atom 


Our goal in this section is to find the eigenvectors for the Hamiltonian $\hat{H}$
in (18.7) with negative eigenvalues. Such eigenvectors constitute “bound
states,” that is, states in which the electron is bound to the proton. For
each negative number $E$, we look at the eigenspace $V_E$ for $\hat{H}$ with
eigenvalue $E$, that is, the space of all $\psi \in \mathrm{Dom}(\hat{H})$
satisfying $\hat{H}\psi = E\psi$. Since $\hat{H}$ is self-adjoint and, therefore,
closed, this eigenspace will be a closed subspace of $L^2(\mathbb{R}^3)$. Since,
also, $\hat{H}$ commutes with rotations, $V_E$ will be invariant under the usual
action (Definition 17.1) of SO(3) on $L^2(\mathbb{R}^3)$. Thus, by the discussion
at the end of Sect. 17.7, $V_E$ decomposes as a direct sum of finite-dimensional,
irreducible, SO(3)-invariant subspaces.

We now look for such subspaces of $V_E$. In the following theorem, we
assume that the radial part of the wave function (the function $f$ in the
notation $V_{l,f}$ in Definition 17.18) has a certain very special form. After
analyzing this case, we argue that we have found in this way all of the
eigenvectors for $\hat{H}$ with negative eigenvalues.


Theorem 18.3 For each positive integer $n$, let

\[
E_n = -\frac{\mu Q^4}{2\hbar^2}\,\frac{1}{n^2}, \tag{18.8}
\]

where $Q$ is the charge of the electron and $\mu$ is the reduced mass of the
electron-proton system, and let

\[
\rho_n(x) = \frac{\sqrt{8\mu |E_n|}}{\hbar}\, |x|.
\]

Then for each $l = 0, 1, \ldots, n-1$, there exists a polynomial $L_{n,l}$ such that
for each homogeneous harmonic polynomial $q$ of degree $l$, the function

\[
\psi(x) = q(x)\, e^{-\rho_n(x)/2}\, L_{n,l}(\rho_n(x)) \tag{18.9}
\]

satisfies

\[
\hat{H}\psi = E_n \psi.
\]

It follows from Proposition 9.35 that the functions $\psi$ in (18.9) belong to
$\mathrm{Dom}(\Delta)$ and thus, by Theorem 9.38, to $\mathrm{Dom}(\hat{H})$. The
polynomials $L_{n,l}$ are the Laguerre polynomials. The coefficient of $-1/n^2$ in
the formula (18.8) for $E_n$ is the Rydberg constant (compare Sect. 1.2.1).
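
For orientation, the energy levels $E_n = -\mu Q^4/(2\hbar^2 n^2)$ can be evaluated numerically. The sketch below works in Gaussian (CGS) units, where the Coulomb constant is 1. The rounded numerical constants and the function name are assumptions supplied for the example and are not taken from the text.

```python
# Assumed, rounded CGS (Gaussian) values; not taken from the text.
ELECTRON_MASS_G = 9.10938e-28   # grams
PROTON_MASS_G = 1.67262e-24     # grams
CHARGE_ESU = 4.80320e-10        # statcoulombs
HBAR_ERG_S = 1.054572e-27       # erg * seconds
ERG_PER_EV = 1.602177e-12       # conversion factor


def energy_level_ev(n: int) -> float:
    """E_n = -(mu Q^4 / (2 hbar^2)) / n^2, converted from erg to eV."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    mu = ELECTRON_MASS_G * PROTON_MASS_G / (ELECTRON_MASS_G + PROTON_MASS_G)
    rydberg_erg = mu * CHARGE_ESU**4 / (2.0 * HBAR_ERG_S**2)
    return -rydberg_erg / n**2 / ERG_PER_EV


levels = [energy_level_ev(n) for n in (1, 2, 3)]
```

With these inputs the ground-state energy comes out near $-13.6$ eV, and the levels scale as $1/n^2$.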

Let  us  see  how  to  connect  Theorem  18.3  to  the  usual  expression  for 
the  hydrogen  atom  eigenvectors  in  the  physics  literature.  In  the  first  place, 
physicists  choose  a  certain  basis  for  the  space  of  harmonic  polynomials, 
which  is — up  to  normalization  constants — the  basis  in  Theorem  17.4.  In  the 
second  place,  physicists  write  the  solutions  in  spherical  coordinates.  When 
changing to spherical coordinates, we should keep in mind that $p_{l,m}$ is
homogeneous of degree $l$ and that $\rho_n(x)$ is just a constant multiple of the
distance from the origin. We obtain, then, the following expression:

\[
\psi_{n,l,m}(r,\theta,\phi) = Y_{l,m}(\theta,\phi)\, \rho_n^{\,l}\, e^{-\rho_n/2}\, L_{n,l}(\rho_n), \tag{18.10}
\]

where $Y_{l,m}(\theta,\phi)$ is the restriction to the unit sphere of $p_{l,m}$.

Proof. If $E$ is a negative real number, we look for solutions to $\hat{H}\psi = E\psi$
of the form $\psi(x) = q(x) f(|x|)$, where $q \in V_l$. Provided that $f(r)$ and
$f'(r)$ are bounded near the origin, Proposition 9.35 allows us to compute
$\Delta\psi$ on $\mathbb{R}^3\setminus\{0\}$ without worrying about whether $\psi$ is
differentiable at the origin. Using Proposition 18.1, the equation for $f$ is

\[
-\frac{\hbar^2}{2\mu}\left[ \frac{d^2 f}{dr^2} + \frac{2(l+1)}{r}\frac{df}{dr} \right]
- \frac{Q^2}{r}\, f(r) = E f(r). \tag{18.11}
\]

For large $r$, the two terms that involve a factor of $1/r$ become negligible,
and so

\[
-\frac{\hbar^2}{2\mu}\frac{d^2 f}{dr^2} \approx E f(r). \tag{18.12}
\]

Recalling that $E$ is negative, (18.12) tells us that near infinity, $f$ should
behave like a combination of a growing and a decaying exponential. Since
we want square-integrable solutions, we require that only the exponentially
decaying term be present.

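We therefore postulate a solution of the form

\[
f(r) = \exp\left\{ -\frac{\sqrt{2\mu |E|}}{\hbar}\, r \right\} g(r) \tag{18.13}
\]

for some function $g$. If we plug (18.13) into (18.11) for $f$, there are canceling
terms equal to $E g(r)$ on each side, leaving

\[
-\frac{\hbar^2}{2\mu}\left[ \frac{d^2 g}{dr^2}
- 2\frac{\sqrt{2\mu |E|}}{\hbar}\frac{dg}{dr}
+ \frac{2(l+1)}{r}\frac{dg}{dr}
- \frac{2(l+1)}{r}\frac{\sqrt{2\mu |E|}}{\hbar}\, g(r) \right]
- \frac{Q^2}{r}\, g(r) = 0.
\]

We now introduce the new variable $\rho = (\sqrt{8\mu |E|}/\hbar)\, r$. After making
this change of variable, we find that each term in square brackets obtains a
factor of $8\mu |E|/\hbar^2$, so that our equation becomes

\[
-\frac{\hbar^2}{2\mu}\,\frac{8\mu |E|}{\hbar^2}\left[ \frac{d^2 g}{d\rho^2}
- \frac{dg}{d\rho}
+ \frac{2(l+1)}{\rho}\frac{dg}{d\rho}
- \frac{(l+1)}{\rho}\, g(\rho) \right]
- \frac{2\sqrt{2\mu |E|}\, Q^2}{\hbar}\,\frac{1}{\rho}\, g(\rho) = 0.
\]

Multiplying through by $\rho$ and simplifying yields the equation

\[
\rho\frac{d^2 g}{d\rho^2} - \rho\frac{dg}{d\rho} + 2(l+1)\frac{dg}{d\rho}
+ \left[ \frac{Q^2\sqrt{\mu}}{\hbar\sqrt{2|E|}} - (l+1) \right] g(\rho) = 0. \tag{18.14}
\]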


If we postulate for $g$ a power series $g(\rho) = \sum_{k=0}^{\infty} a_k \rho^k$,
we obtain the following recurrence relation for the coefficients:

\[
a_{k+1} = a_k\, \frac{k + l + 1 - \lambda}{(k+1)\,[k + 2(l+1)]}, \tag{18.15}
\]

where

\[
\lambda = \frac{Q^2\sqrt{\mu}}{\hbar\sqrt{2|E|}}.
\]

The series for $g$ will terminate, yielding a polynomial solution to (18.14),
provided that $\lambda$ is an integer $n$ with $n \geq l+1$. We can then solve for the
energy in terms of $n$ as follows:

\[
|E| = \frac{\mu Q^4}{2 n^2 \hbar^2}.
\]

Recalling that $E$ is negative, we have obtained the desired form for the
energy levels. Furthermore, the condition $n \geq l+1$ is the same as $l \leq n-1$.
Finally, if we plug in the formula for $\rho$ in terms of $r$ and the formula for $f$
in terms of $g$, we obtain the form of the solution stated in the theorem. ■
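
The termination argument can be checked with exact rational arithmetic. The Python sketch below takes $\lambda = n$, builds the coefficients from the recurrence $a_{k+1} = a_k\,(k+l+1-n)/((k+1)(k+2l+2))$ with the arbitrary normalization $a_0 = 1$, and then substitutes the resulting polynomial into $\rho g'' - \rho g' + 2(l+1) g' + (n-l-1) g$. The function names and the normalization are assumptions made for the illustration; the polynomials obtained this way agree with the $L_{n,l}$ of the theorem only up to a constant factor.

```python
from fractions import Fraction


def radial_polynomial(n: int, l: int) -> list:
    """Coefficients a_0, ..., a_{n-l-1} of g(rho), with lambda = n and a_0 = 1."""
    if not 0 <= l <= n - 1:
        raise ValueError("need 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    for k in range(n - l - 1):
        ratio = Fraction(k + l + 1 - n, (k + 1) * (k + 2 * (l + 1)))
        coeffs.append(coeffs[-1] * ratio)
    return coeffs


def residual(coeffs: list, n: int, l: int) -> list:
    """Coefficients of rho*g'' - rho*g' + 2(l+1)*g' + (n-l-1)*g in powers of rho."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for k, a in enumerate(coeffs):
        if k >= 1:
            # Contributions of rho*g'' and 2(l+1)*g' to the power rho^(k-1).
            out[k - 1] += (k * (k - 1) + 2 * (l + 1) * k) * a
        # Contributions of -rho*g' and (n-l-1)*g to the power rho^k.
        out[k] += (n - l - 1 - k) * a
    return out


example = radial_polynomial(3, 0)
```

For $n = 2$, $l = 0$ this gives $g(\rho) = 1 - \rho/2$, and every entry of the residual vanishes for each admissible pair $(n, l)$.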

It is important to emphasize that the functions in Theorem 18.3 do not
span the entire Hilbert space $L^2(\mathbb{R}^3)$. After all, these functions are all
eigenvectors for $\hat{H}$ with negative eigenvalues. If these vectors spanned
$L^2(\mathbb{R}^3)$, then the expectation value of the energy would always be negative.
But it is easy to produce functions $\psi$ in the domain of $\hat{H}$ for which
$\langle \psi, \hat{H}\psi \rangle > 0$. Simply take $\psi$ to be a Gaussian wave packet
with mean position far from the origin and with very large mean momentum. Then the
potential-energy term $\langle \psi, V\psi \rangle$ will be close to zero but
$\langle \psi, \hat{P}^2\psi \rangle$ will be large and positive. Nevertheless, it can
be shown that the functions in Theorem 18.3 span the negative-energy
subspace of $L^2(\mathbb{R}^3)$. It is possible to analyze also the positive part of the
spectrum of $\hat{H}$, but the spectrum above zero is purely continuous and
represents a hydrogen atom that has ionized, that is, in which the electron has
escaped from the proton.

Theorem 18.4 As $n$ varies over all positive integers, $l$ varies from 0 to
$n-1$, and $q$ varies over all homogeneous harmonic polynomials of degree
$l$, the eigenvectors in Theorem 18.3 span the negative-energy subspace of
$L^2(\mathbb{R}^3)$, that is, the range of the projection $\mu^{\hat{H}}((-\infty, 0))$,
where $\mu^{\hat{H}}$ is the projection-valued measure associated to $\hat{H}$ by the
spectral theorem.


Proof.  The  proof  requires  results  from  spectral  theory  that  go  beyond  the 
machinery  that  we  have  developed  in  Chaps.  9  and  10,  and  which  we  cannot 
reproduce  in  full  here.  Specifically,  we  make  use  of  Theorem  V.5.7  of  [27], 
which  tells  us  that  the  negative-energy  portion  of  the  spectrum  of  H  is 
discrete,  consisting  of  eigenvalues  of  finite  multiplicity  accumulating  only 
at  zero. 

We indicate briefly why the above result holds. If $A$ and $B$ are unbounded
self-adjoint operators, let us say that $B$ is a relatively compact perturbation
of $A$ if $B(A - \lambda I)^{-1}$ is a compact operator for every $\lambda$ in the
resolvent set of $A$. According to Lemma V.5.8 of [27], the potential energy operator
for the hydrogen atom is a relatively compact perturbation of the kinetic
energy operator. This is a strengthening of what we showed in the proof
of Theorem 9.38, namely that the potential energy operator is relatively
bounded with respect to the kinetic energy operator, with relative bound
less than 1. The proof of relative compactness relies on the fact that the
potential for the hydrogen atom goes to zero at infinity.

Meanwhile, let us say that $\lambda$ belongs to the essential spectrum of an
unbounded self-adjoint operator $A$ if either $\lambda$ is a nonisolated point in
$\sigma(A)$ or $\lambda$ is an eigenvalue for $A$ with infinite multiplicity. According to
Theorem IV.5.35 of [27], a relatively compact perturbation of a self-adjoint
operator does not change the essential spectrum. Thus, the essential
spectrum of $\hat{H}$ is equal to the essential spectrum of the kinetic energy operator,
which is certainly contained in $[0, \infty)$, since the kinetic energy operator is
non-negative. It follows that any point in the negative-energy part of the
spectrum of $\hat{H}$ must be an isolated point in $\sigma(\hat{H})$ and an eigenvalue of
finite multiplicity.

In light of the preceding result, there is no continuous spectrum for $\hat{H}$
below zero, and we need only look for square-integrable eigenvectors. Since,
also, each eigenspace for $\hat{H}$ with eigenvalue $E < 0$ is finite dimensional, it
will decompose as a direct sum of irreducible, SO(3)-invariant subspaces.
Such subspaces, according to Proposition 17.19, are always of the form $V_{l,f}$
for some $l$ and $f$, where $V_{l,f}$ is as in Definition 17.18. Thus, we look for
functions $\psi$ of the form $\psi(x) = p(x) f(|x|)$ such that $\hat{H}\psi = E\psi$ for some
$E < 0$.


Now, if a function of the form $p(x) f(|x|)$ is to be an eigenfunction of
the Hamiltonian, $f$ must satisfy the differential equation (18.11). By
elementary results from the theory of linear ordinary differential equations,
this equation has precisely two linearly independent solutions, for any value
of $E$. Both solutions can be constructed by postulating a solution of the
form (18.13), introducing the new variable $\rho$, and then using a power series
expansion for $g(\rho)$ (Exercise 9). One of the solutions for $g(\rho)$ will have a
power series starting with $\rho^{-(2l+1)}$, in which case $\psi(x)$ will blow up like
$1/|x|^{l+1}$ near the origin; such a function is not in the domain of the
Hamiltonian (Exercise 14 in Chap. 9). The other solution for $g(\rho)$ will start with
$\rho^0$ and may be obtained by using the form (18.13), changing from the
variable $r$ to the variable $\rho$, and then using the recurrence relation (18.15) to
define the coefficients of a power series. If the resulting series does not
terminate, it is not hard to see that the terms will behave for large $k$ like the
series for $e^{\rho}$. Since the function $f$ is equal to $e^{-\rho/2} g(\rho)$, this function will
grow like $e^{\rho/2}$ near infinity, which means that $\psi$ will not be in
$L^2(\mathbb{R}^3)$. Thus, to get a square-integrable solution, the series for $g(\rho)$ must
terminate, in which case $\psi$ is one of the functions in Theorem 18.3. ■


Corollary 18.5 Each eigenvalue $E_n$, as given in Theorem 18.3, has
multiplicity $n^2$.

Proof. According to Theorem 18.4, the eigenvectors in Theorem 18.3
constitute all of the eigenvectors for $\hat{H}$ with eigenvalue $E_n$. The number of
independent eigenvectors with eigenvalue $E_n$ is thus the sum of the
dimensions of the spaces $V_l$ of spherical harmonics, with $l = 0, 1, \ldots, n-1$. This
number is, by Theorem 17.12,

\[
\sum_{l=0}^{n-1} (2l + 1) = n^2,
\]

as claimed. ■
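
The counting in the proof is easy to confirm directly. The following short Python sketch (the function name is hypothetical) sums the dimensions $2l+1$ of the spaces of spherical harmonics for $l = 0, \ldots, n-1$ and compares the total with $n^2$.

```python
def degeneracy(n: int) -> int:
    """Sum of the dimensions 2l + 1 for l = 0, 1, ..., n - 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return sum(2 * l + 1 for l in range(n))


# Multiplicities of the first few energy levels.
first_levels = [degeneracy(n) for n in range(1, 6)]
```

This is just the familiar fact that the sum of the first $n$ odd numbers is $n^2$.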


18.4  The  Runge-Lenz  Vector  in  the  Quantum 
Kepler  Problem 

In Sect. 2.6, we showed that the classical Kepler problem can be solved
almost completely by making use of the Runge-Lenz vector, which is a
conserved quantity. The quantum version of the Runge-Lenz vector commutes
with the Hamiltonian and can elucidate a number of special properties of
the quantum Kepler problem, which we typically think of as describing a
hydrogen atom. In particular, the Runge-Lenz vector will help to explain
(1) the simple form $-R/n^2$ of the negative energies of the hydrogen atom
and (2) the apparent coincidence by which the energy of the states in (18.9)
is independent of $l$ for a given $n$. Note that the rotational symmetry of
the problem explains why the energy of the states in (18.9) is independent
of the choice of the harmonic polynomial $q$. Nevertheless, rotational
symmetry cannot explain why states for different values of $l$ (and thus
different radial dependence in the wave function) have the same energy. This
apparent coincidence will be explained by an additional symmetry of the
problem that is expressible in terms of the Runge-Lenz vector. See also
Sect. 7 of [17] for a somewhat different (but related) explanation for the
structure of the eigenvalues of the hydrogen atom and their multiplicities.

There  are  several  computations  involving  the  Runge-Lenz  vector  that, 
while  elementary,  are  laborious.  Those  computations  are  deferred  to 
Sect.  18.6. 


18.4.1 Some Notation

To keep the notation as simple as possible, we will adopt in this section
Einstein’s summation convention, which states that repeated indices are
always summed on, even if there is no summation sign written. In this
section, the sum will always range from 1 to 3. Using this convention, we
write, say, the dot product of two vectors $u, v$ in $\mathbb{R}^3$ as
$u \cdot v = u_j v_j$, where the summation convention frees us from having to write
out explicitly the sum over $j$.

We will make frequent use of the totally antisymmetric symbol $\varepsilon_{jkl}$, where
$j$, $k$, and $l$ range from 1 to 3, defined as follows.

Definition 18.6 For $j, k, l \in \{1, 2, 3\}$, define $\varepsilon_{jkl}$ by the formula

\[
\varepsilon_{jkl} =
\begin{cases}
1 & \text{if } (j, k, l) \text{ is an even permutation of } (1, 2, 3) \\
-1 & \text{if } (j, k, l) \text{ is an odd permutation of } (1, 2, 3) \\
0 & \text{if any two of } j, k, l \text{ are equal.}
\end{cases}
\]

Thus, for example, $\varepsilon_{321} = -1$ and $\varepsilon_{212} = 0$. The commutation relations
for the basis $\{F_1, F_2, F_3\}$ for so(3) may be written (using the summation
convention!) as

\[
[F_j, F_k] = \varepsilon_{jkl} F_l. \tag{18.16}
\]

For instance, if we take $j = 1$ and $k = 2$ in (18.16), then the sum on $l$ gives
a nonzero value only when $l = 3$, and we recover the relation $[F_1, F_2] = F_3$.
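
The antisymmetric symbol and the so(3) commutation relations can be checked by brute force. In the Python sketch below, the closed-form expression for $\varepsilon_{jkl}$ and the choice of matrices with entries $(F_j)_{kl} = -\varepsilon_{jkl}$ are assumptions made for the illustration (one standard realization of a basis of so(3)); the helper names are hypothetical.

```python
def epsilon(j: int, k: int, l: int) -> int:
    """Totally antisymmetric symbol for indices in {1, 2, 3}."""
    # The product is +2 or -2 on permutations of (1, 2, 3) and 0 otherwise.
    return (j - k) * (k - l) * (l - j) // 2


def basis_matrix(j: int) -> list:
    """3x3 matrix F_j with entries (F_j)_{kl} = -epsilon_{jkl} (assumed realization)."""
    return [[-epsilon(j, k, l) for l in (1, 2, 3)] for k in (1, 2, 3)]


def matmul(a: list, b: list) -> list:
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]


def commutator(a: list, b: list) -> list:
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]


def structure_sum(j: int, k: int) -> list:
    """Right-hand side epsilon_{jkl} F_l with the sum over l written out."""
    total = [[0] * 3 for _ in range(3)]
    for l in (1, 2, 3):
        f = basis_matrix(l)
        for r in range(3):
            for c in range(3):
                total[r][c] += epsilon(j, k, l) * f[r][c]
    return total
```

Writing the sum over $l$ explicitly in `structure_sum` is exactly what the summation convention lets us suppress on paper.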


18.4.2 The Classical Runge-Lenz Vector, Revisited

We have already introduced, in Sect. 2.6, the Runge-Lenz vector $A$ in the
classical mechanics of a particle moving in a $1/r$ potential. We require a few
more properties of $A$ before turning to the quantum version. We consider
a classical particle in $\mathbb{R}^3$ with Hamiltonian given by

\[
H(x, p) = \frac{|p|^2}{2\mu} - \frac{Q^2}{|x|}. \tag{18.17}
\]

This is just the Hamiltonian for the classical Kepler problem, except that
we replace the mass $m$ of the planet by the reduced mass $\mu$ of the
electron-proton system, and we replace the constant $k := mMG$ by $Q^2$.

For the Hamiltonian in (18.17), the Runge-Lenz vector is given by the
formula

\[
A(x, p) = \frac{1}{\mu Q^2}\, p \times J - \frac{x}{|x|},
\]

where $J := x \times p$ is the angular momentum. By Proposition 2.34, the
Runge-Lenz vector is a conserved quantity for the classical Kepler problem,
in addition to $H$ and $J$, which are conserved quantities for any radial
potential. By results of Sect. 2.6, we have the following relations among
these conserved quantities:

\[
A \cdot J = 0
\]

\[
|A|^2 = 1 + \frac{2H}{\mu Q^4}\, |J|^2.
\]
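
These two relations can be sanity-checked numerically at an arbitrary phase-space point. The Python sketch below assumes the Hamiltonian $H = |p|^2/(2\mu) - Q^2/|x|$ and the Runge-Lenz vector $A = (\mu Q^2)^{-1}\, p \times J - x/|x|$; the function names and the sample values of $x$, $p$, $\mu$, $Q$ are arbitrary choices made for the illustration.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kepler_quantities(x, p, mu=1.0, q=1.0):
    """Return (H, J, A) for the Kepler Hamiltonian with coupling q**2."""
    r = math.sqrt(dot(x, x))
    h = dot(p, p) / (2.0 * mu) - q**2 / r
    j = cross(x, p)
    p_cross_j = cross(p, j)
    a = tuple(p_cross_j[i] / (mu * q**2) - x[i] / r for i in range(3))
    return h, j, a


# Arbitrary sample point and parameters (illustrative values only).
MU, Q = 1.3, 0.8
h, j, a = kepler_quantities((1.0, 0.5, -0.3), (0.2, -0.4, 0.7), MU, Q)

a_dot_j = dot(a, j)
norm_gap = dot(a, a) - (1.0 + 2.0 * h * dot(j, j) / (MU * Q**4))
```

Both `a_dot_j` and `norm_gap` should vanish up to rounding error, whatever the sign of the energy at the chosen point.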


Lemma 18.7 The Runge-Lenz vector $A$ and the Hamiltonian $H$ in (18.17)
satisfy the following Poisson bracket relations:

\[
\{A_j, H\} = 0
\]

\[
\{A_j, A_m\} = -\frac{2}{\mu Q^4}\, \varepsilon_{jml} J_l H. \tag{18.18}
\]

We have already shown that the Runge-Lenz vector is a conserved
quantity (Proposition 2.34), which is equivalent (Proposition 2.25) to saying that
the Poisson bracket of $A_j$ with $H$ is zero, as claimed. The proof of (18.18)
is deferred to Sect. 18.6. We now introduce certain combinations of the
Runge-Lenz vector, the angular momentum, and the Hamiltonian that
form a Lie algebra under the Poisson bracket. In the construction of these
functions, we need to take a square root of the Hamiltonian, which
necessitates separating the positive-energy and negative-energy parts of the phase
space. Our interest is primarily in the negative-energy case.


Definition 18.8 Let $U^-$ denote the negative-energy part of the classical
phase space,

\[
U^- = \{ (x, p) \in \mathbb{R}^6 \mid H(x, p) < 0 \}.
\]

Consider on $U^-$ the normalized Runge-Lenz vector $B$ given by

\[
B(x, p) = Q^2 \sqrt{\frac{\mu}{2\,|H(x, p)|}}\; A(x, p).
\]

Define also vector-valued functions $I$ and $K$ on $U^-$ by

\[
I = \frac{J + B}{2}; \qquad K = \frac{J - B}{2}.
\]

Theorem 18.9 The functions $I$ and $K$ Poisson-commute with the
Hamiltonian and satisfy the following Poisson-bracket relations on the
negative-energy set $U^-$:

\[
\{I_j, I_k\} = \varepsilon_{jkl} I_l
\]
\[
\{K_j, K_k\} = \varepsilon_{jkl} K_l
\]
\[
\{I_j, K_k\} = 0.
\]

The functions $I$ and $K$ also satisfy the following algebraic relations:

\[
|I|^2 = |K|^2 = \frac{\mu Q^4}{8\,|H|}.
\]

In Theorem 18.9, we use the summation convention introduced in the
previous subsection. The proof of this theorem is elementary but rather
laborious, and is deferred to Sect. 18.6.
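
The algebraic relation between $I$, $K$, and the energy can be tested numerically at a point of negative energy. The Python sketch below assumes the normalization $B = Q^2 \sqrt{\mu/(2|H|)}\, A$ with $I = (J+B)/2$ and $K = (J-B)/2$, which is the normalization consistent with the bracket relations above; the function names and sample values are arbitrary choices for the illustration, and the Poisson brackets themselves are not checked here.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def so4_generators(x, p, mu=1.0, q=1.0):
    """Return (H, I, K) at a negative-energy point of the Kepler problem."""
    r = math.sqrt(dot(x, x))
    h = dot(p, p) / (2.0 * mu) - q**2 / r
    if h >= 0:
        raise ValueError("point is not in the negative-energy region")
    j = cross(x, p)
    p_cross_j = cross(p, j)
    a = [p_cross_j[n] / (mu * q**2) - x[n] / r for n in range(3)]
    scale = q**2 * math.sqrt(mu / (2.0 * abs(h)))   # assumed normalization of B
    b = [scale * a_n for a_n in a]
    i_vec = [(j[n] + b[n]) / 2.0 for n in range(3)]
    k_vec = [(j[n] - b[n]) / 2.0 for n in range(3)]
    return h, i_vec, k_vec


# Arbitrary bound-state sample point (illustrative values only).
MU, Q = 1.3, 0.8
h, i_vec, k_vec = so4_generators((1.0, 0.5, -0.3), (0.2, -0.4, 0.3), MU, Q)
expected = MU * Q**4 / (8.0 * abs(h))
```

Here `dot(i_vec, i_vec)` and `dot(k_vec, k_vec)` should both agree with `expected`, reflecting the fact that $J \cdot B = 0$ and $|J|^2 + |B|^2 = \mu Q^4/(2|H|)$.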

The span of the functions $I_1, I_2, I_3$ and $K_1, K_2, K_3$ on $U^-$, which is
the same as the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$, forms a
6-dimensional Lie algebra under the Poisson bracket. Comparing the
Poisson-bracket relations among the $I$’s and among the $K$’s to the relations among
the basis elements $F_1, F_2, F_3$ for so(3), we see that the span of the $I$’s and
the span of the $K$’s are both isomorphic to so(3) [or, if you prefer, to su(2)].
Since also each $I_j$ Poisson-commutes with each $K_k$, the 6-dimensional Lie algebra
spanned by the $I$’s and the $K$’s is isomorphic to so(3) $\oplus$ so(3). Meanwhile,
as demonstrated in Exercise 4, so(3) $\oplus$ so(3) is isomorphic to the Lie algebra
so(4). Since all the $I$’s and $K$’s Poisson-commute with the Hamiltonian, we
say that the Kepler problem has so(4) symmetry. This is in contrast to the
dynamics of a particle moving in $\mathbb{R}^3$ in the force generated by a typical
radial potential, which has only so(3) symmetry.

To be more precise, “so(4) symmetry” prevails only on the negative-energy
subset $U^-$ of the classical phase space. On the positive-energy subset
$U^+$, the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$ again forms a
6-dimensional Lie algebra. This Lie algebra, however, is not isomorphic to
so(4), but rather to so(3, 1), where so(3, 1) is the Lie algebra of the group of
$4 \times 4$ matrices that preserve the quadratic form $x_1^2 + x_2^2 + x_3^2 - x_4^2$.
The reason the formulas on $U^+$ are different from those on $U^-$ is that the
calculation of the relevant Poisson brackets involves the function $H/|H|$, which has the
value 1 on $U^+$ and the value $-1$ on $U^-$. (The factor of $H$ comes from
Lemma 18.7 and the factor of $|H|$ from the factor of $\sqrt{|H|}$ in the definition
of $B$.)


18.4.3 The Quantum Runge-Lenz Vector

We now introduce the quantum counterpart $\hat{A}$ of the classical Runge-Lenz
vector $A$. The quantum Runge-Lenz vector satisfies most of the same properties
as the classical version, with a few small but crucial “quantum corrections.”

Definition 18.10 Define the quantum Runge-Lenz vector by

\[
\hat{A} = \frac{1}{\mu Q^2}\,\frac{1}{2}\left( \hat{P} \times \hat{J} - \hat{J} \times \hat{P} \right)
- \frac{\hat{X}}{|\hat{X}|}.
\]

Note that in the quantum case, $-\hat{J} \times \hat{P}$ is not the same as
$\hat{P} \times \hat{J}$, because of the noncommutativity of the factors. The particular
combination of $\hat{P} \times \hat{J}$ and $\hat{J} \times \hat{P}$ in Definition 18.10 is
used because it yields a self-adjoint operator. The Runge-Lenz vector can also be
computed as

\[
\hat{A} = \frac{1}{\mu Q^2}\left( \hat{P} \times \hat{J} - i\hbar \hat{P} \right)
- \frac{\hat{X}}{|\hat{X}|}, \tag{18.19}
\]

as will be verified in Sect. 18.6.

In  the  interests  of  keeping  the  exposition  manageable,  we  will  not  concern 
ourselves  in  what  follows  with  determining  the  precise  domains  on  which 
various  identities  hold. 


Proposition 18.11 The quantum Runge-Lenz vector $\hat{A}$ satisfies the
following relations:

\[
\hat{A} \cdot \hat{J} = \hat{J} \cdot \hat{A} = 0
\]

\[
\hat{A} \cdot \hat{A} = 1 + \frac{2\hat{H}}{\mu Q^4}\left( \hat{J} \cdot \hat{J} + \hbar^2 \right). \tag{18.20}
\]

Note that there is a “quantum correction” in (18.20); the factor of $J \cdot J$
in the classical expression for $A \cdot A$ is replaced by $\hat{J} \cdot \hat{J} + \hbar^2$.
This correction gives rise to a quantum correction in (18.22), which in turn is essential
to getting the correct value for the energy eigenvalues in Corollary 18.17.
The proofs of this result and the other results of this section are deferred to
Sect. 18.6.


Lemma 18.12 The quantum Runge-Lenz vector $\hat{A}$ and the Hamiltonian
$\hat{H}$ satisfy the following commutation relations:

\[
\frac{1}{i\hbar}[\hat{A}_j, \hat{H}] = 0
\]

\[
\frac{1}{i\hbar}[\hat{A}_j, \hat{A}_m] = -\frac{2}{\mu Q^4}\, \varepsilon_{jml} \hat{J}_l \hat{H}. \tag{18.21}
\]

Note that since $\hat{H}$ commutes with rotations, it commutes with the
angular momentum operators $\hat{J}_l$. Thus, in (18.21), we could just as well write
$\hat{H}\hat{J}_l$ in place of $\hat{J}_l\hat{H}$. As in the classical case, if we normalize the
components of the Runge-Lenz vector by dividing by the square root of the
Hamiltonian, then these operators together with the angular momentum
operators form a 6-dimensional Lie algebra.

Definition 18.13 Let $V^-$ denote the negative-energy subspace of $L^2(\mathbb{R}^3)$,
that is, the range of the spectral projection $\mu^{\hat{H}}((-\infty, 0))$. Let $|\hat{H}|$ denote
the restriction to $V^-$ of the operator $-\hat{H}$. On $V^-$, define operators $\hat{B}$ by

\[
\hat{B} = Q^2 \sqrt{\frac{\mu}{2\,|\hat{H}|}}\; \hat{A}.
\]

Define also operators $\hat{I}$ and $\hat{K}$, as in the classical case, by

\[
\hat{I} = \frac{\hat{J} + \hat{B}}{2}; \qquad \hat{K} = \frac{\hat{J} - \hat{B}}{2}.
\]

It is possible to define the absolute value of any self-adjoint operator
by means of the functional calculus. However, since the restriction of $\hat{H}$
to $V^-$ is, by definition, negative definite, the restriction of $|\hat{H}|$ to $V^-$
coincides with the restriction to $V^-$ of $-\hat{H}$. The operator $1/\sqrt{|\hat{H}|}$ is the
operator whose restriction to the energy eigenspace with eigenvalue $E_n$
is $(1/\sqrt{|E_n|})\, I$. The components of $\hat{B}$ are unbounded operators, defined
on suitable dense subspaces of the Hilbert space $V^-$.

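Theorem 18.14 The operators $\hat{I}$ and $\hat{K}$ commute with the Hamiltonian
$\hat{H}$ and satisfy the following commutation relations:

\[
\frac{1}{i\hbar}[\hat{I}_j, \hat{I}_k] = \varepsilon_{jkl} \hat{I}_l
\]
\[
\frac{1}{i\hbar}[\hat{K}_j, \hat{K}_k] = \varepsilon_{jkl} \hat{K}_l
\]
\[
\frac{1}{i\hbar}[\hat{I}_j, \hat{K}_k] = 0.
\]

These operators also satisfy the following algebraic relations:

\[
\hat{I} \cdot \hat{I} = \hat{K} \cdot \hat{K} = \frac{\mu Q^4}{8\,|\hat{H}|} - \frac{\hbar^2}{4}. \tag{18.22}
\]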


18.4.4 Representations of so(4)

In light of the commutation relations in Theorem 18.14, we can define a
representation $\pi$ of the Lie algebra so(4) $\cong$ so(3) $\oplus$ so(3) on the
negative-energy subspace $V^-$ as follows:

\[
\pi(F_j, 0) = \frac{1}{i\hbar}\hat{I}_j; \qquad \pi(0, F_j) = \frac{1}{i\hbar}\hat{K}_j. \tag{18.23}
\]

It is therefore desirable to classify the irreducible finite-dimensional
representations of so(3) $\oplus$ so(3), which we do in the following proposition.

Proposition 18.15 Suppose $V_k$ and $V_l$ are irreducible representations of
so(3) of dimensions $2k+1$ and $2l+1$, respectively. Then $V_k \otimes V_l$ is irreducible
when viewed as a representation of so(3) $\oplus$ so(3) as in Remark 16.49.
Furthermore, every irreducible finite-dimensional representation of so(3) $\oplus$ so(3)
is isomorphic to $V_k \otimes V_l$ for a unique ordered pair $(k, l)$.

For any representation $V_k \otimes V_l$ of so(3) $\oplus$ so(3), define Casimir operators
$C_1$ and $C_2$ by the formula

\[
C_1 = \sum_{j=1}^{3} \pi_k(F_j)^2 \otimes I
\]
\[
C_2 = \sum_{j=1}^{3} I \otimes \pi_l(F_j)^2.
\]

Then we have

\[
C_1 = -k(k+1)\, I; \qquad C_2 = -l(l+1)\, I.
\]


Proof. To classify the irreducible representations of so(3) $\oplus$ so(3), we could
appeal to the general theory of representations of direct sums of Lie
algebras. It is not hard, however, to give a direct proof using the same sort
of reasoning we used in the classification of irreducible representations
of so(3). We will omit the details of this computation. The result on the
Casimir operators follows easily from Proposition 17.8. ■

In any finite-dimensional subspace of $V^-$ that is invariant and irreducible
under the action of so(3) $\oplus$ so(3) in (18.23), the Casimir operators are given
by $C_1 = -\hat{I} \cdot \hat{I}/\hbar^2$ and $C_2 = -\hat{K} \cdot \hat{K}/\hbar^2$. Since, by
Theorem 18.14, $\hat{I} \cdot \hat{I} = \hat{K} \cdot \hat{K}$ on $V^-$, all of the irreducible
representations of so(3) $\oplus$ so(3) that arise inside $V^-$ will be of the form
$V_k \otimes V_k$.

Theorem 18.16 Let $W^{(n)}$ denote the eigenspace for the Hamiltonian with
eigenvalue $E_n$. Then $W^{(n)}$ is invariant and irreducible under the action of
so(3) $\oplus$ so(3) in (18.23). More specifically, we have the isomorphism

\[
W^{(n)} \cong V_k \otimes V_k,
\]

as representations of so(3) $\oplus$ so(3), where $k = (n-1)/2$ and where $V_k$ is
the irreducible representation of so(3) of dimension $2k+1 = n$.

Corollary 18.17 If $n$, $k$, and $W^{(n)}$ are as in Theorem 18.16, then for all
$\psi \in W^{(n)}$, we have

\[
\hat{I} \cdot \hat{I}\,\psi = \hat{K} \cdot \hat{K}\,\psi = \hbar^2 k(k+1)\,\psi.
\]

Using (18.22), the eigenvalue $E_n$ of $\hat{H}$ on $W^{(n)}$ can be solved for as

\[
E_n = -\frac{\mu Q^4}{8\hbar^2 \left(k + \tfrac{1}{2}\right)^2} = -\frac{\mu Q^4}{2\hbar^2 n^2}.
\]

The expression for $E_n$ in Corollary 18.17 is the same as in Theorem 18.3.
The remarkable thing about the proof of Corollary 18.17 is that it is purely
algebraic, relying only on the commutation relations among the operators
$\hat{I}_k$ and $\hat{K}_l$, along with the relationship (18.22) between the Hamiltonian
operator $\hat{H}$ and the $\hat{I}_k$’s and $\hat{K}_l$’s.


Proof of Corollary 18.17. It is easily seen that the operators $\hat{I} \cdot \hat{I}$ and
$\hat{K} \cdot \hat{K}$, when restricted to an irreducible subspace for the action of
so(3) $\oplus$ so(3), are equal to $-\hbar^2 C_1$ and $-\hbar^2 C_2$, where $C_1$ and $C_2$ are the
Casimir operators appearing in Proposition 18.15. Thus, if $W^{(n)}$ is isomorphic to
$V_k \otimes V_k$, with $k = (n-1)/2$, then $\hat{I} \cdot \hat{I}$ and $\hat{K} \cdot \hat{K}$ will be
equal to $\hbar^2 k(k+1) I$, as claimed. On the other hand, $\hat{I} \cdot \hat{I}$ and
$\hat{K} \cdot \hat{K}$ are related to the Hamiltonian $\hat{H}$ by (18.22), from which we can
solve for $E_n$. ■
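
The last step, solving for the energy, is a one-line computation that can be verified with exact arithmetic. The Python sketch below assumes the relation $\hbar^2 k(k+1) = \mu Q^4/(8|E|) - \hbar^2/4$ obtained by combining the Casimir eigenvalue with the algebraic relation between $\hat{I} \cdot \hat{I}$ and $|\hat{H}|$; energies are expressed in units of $\mu Q^4/\hbar^2$, and the function names are hypothetical.

```python
from fractions import Fraction


def energy_from_k(k: Fraction) -> Fraction:
    """Solve hbar^2 k(k+1) = mu Q^4/(8|E|) - hbar^2/4 for E < 0.

    The result is expressed in units of mu Q^4 / hbar^2.
    """
    return -Fraction(1, 8) / (k * (k + 1) + Fraction(1, 4))


def energy_from_n(n: int) -> Fraction:
    """The formula E_n = -1/(2 n^2) in the same units."""
    return -Fraction(1, 2 * n * n)


# k = (n - 1)/2 takes half-integer values when n is even.
pairs = [(n, Fraction(n - 1, 2)) for n in range(1, 8)]
```

Since $k(k+1) + 1/4 = (k + 1/2)^2 = n^2/4$ when $n = 2k+1$, the two functions agree for every $n$. Without the $-\hbar^2/4$ term the energies would not come out proportional to $1/n^2$.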

Proof of Theorem 18.16. Since each component of $\hat{\mathbf{A}}$ and $\hat{\mathbf{J}}$ commutes with $\hat{H}$, each component of $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$ will also commute with $\hat{H}$. Each eigenspace $W^{(n)}$ of $\hat{H}$ is therefore invariant under the action of $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$. Since the $\hat{I}$'s and $\hat{K}$'s are self-adjoint and $W^{(n)}$ is finite dimensional, $W^{(n)}$ will decompose as a direct sum of irreducible invariant subspaces. By Proposition 18.15, these irreducible subspaces will be of the form $V_k \otimes V_l$, where $V_k$ and $V_l$ are irreducible representations of so(3) of dimension $2k+1$ and $2l+1$, respectively. But now, the operators $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$, when restricted to one of the irreducible subspaces of $W^{(n)}$, are equal to $-\hbar^2 C_1$ and $-\hbar^2 C_2$, where $C_1$ and $C_2$ are the Casimir operators appearing in Proposition 18.15. Since $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}} = \hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$ on all of $V^-$, the eigenvalues of $C_1$ and $C_2$ must be equal on each irreducible subspace of $W^{(n)}$. Thus, we must have $k = l$, meaning that only irreducible subspaces of the form $V_k \otimes V_k$ arise.

Now, under the isomorphism of some irreducible subspace of $W^{(n)}$ with $V_k \otimes V_k$, the operators $\hat{I}_k$ and $\hat{K}_k$ act as $i\hbar F_k \otimes I$ and $i\hbar I \otimes F_k$, respectively, where the $F_k$'s are the usual basis for so(3). Since $\hat{\mathbf{J}} = \hat{\mathbf{I}} + \hat{\mathbf{K}}$, each $\hat{J}_k$ acts as $i\hbar(F_k \otimes I + I \otimes F_k)$. This means that $V_k \otimes V_k$, under the action of the $\hat{J}_k$'s, can be thought of as a tensor product of two representations of so(3), viewed as another representation of so(3) as in Definition 16.48. Viewed this way, $V_k \otimes V_k$ decomposes as in Proposition 17.23 as

\[ V_k \otimes V_k \cong V_0 \oplus V_1 \oplus \cdots \oplus V_{2k}. \tag{18.24} \]

On the other hand, we know from Theorem 18.3 that $W^{(n)}$ decomposes under the action of so(3) as

\[ V_0 \oplus V_1 \oplus \cdots \oplus V_{n-1}. \tag{18.25} \]

Thus, the space of the form $V_k \otimes V_k$ must be all of $W^{(n)}$; if there were another term, then the trivial representation $V_0$ would occur more than once in $W^{(n)}$. This being the case, matching the decompositions (18.24) and (18.25) requires that $2k = n - 1$, as claimed in the theorem. ■
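The bookkeeping in this proof can be checked mechanically. The following is a minimal sketch (the function names `dim` and `tensor_square_summands` are illustrative, not from the text) that assumes only the statements already made: $V_j$ has dimension $2j+1$, and $V_k \otimes V_k$ contains the spins $0, 1, \ldots, 2k$ once each. It confirms that with $k = (n-1)/2$ the summands agree with (18.25), that the total dimension is $n^2$, and that the two denominators in the formula for $E_n$ in Corollary 18.17 coincide.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def dim(j):
    """Dimension 2j + 1 of the irreducible so(3) representation V_j."""
    return int(2 * j + 1)

def tensor_square_summands(k):
    """Spins occurring in V_k (x) V_k under the diagonal so(3): 0, 1, ..., 2k."""
    return list(range(int(2 * k) + 1))

for n in range(1, 8):
    k = Fraction(n - 1, 2)
    summands = tensor_square_summands(k)
    # The decomposition (18.24) must match (18.25): V_0 + ... + V_{n-1}.
    assert summands == list(range(n))
    # Both counts of the dimension of W^(n) give n^2.
    assert sum(dim(j) for j in summands) == n * n
    assert dim(k) ** 2 == n * n
    # The two denominators in Corollary 18.17 agree: 8 (k + 1/2)^2 = 2 n^2.
    assert 8 * (k + HALF) ** 2 == 2 * n * n
    print(f"n={n}  k={k}  spins={summands}  dim={n * n}")
```

The check is pure counting; it says nothing about which operators realize the decomposition, which is the content of the proof itself.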

The proof of Theorem 18.16 relies to some extent on the results of Sect. 18.3. Using only algebraic manipulations involving the Runge-Lenz vector, however, we could still argue that the eigenvalues of $\hat{H}$ must be of the form given in Corollary 18.17. We would not, however, know that for every positive integer $n$, the number $E_n$ is actually an eigenvalue for $\hat{H}$. We would also not know that each eigenspace $W^{(n)}$ is irreducible under the action of so(4); conceivably, based only on the algebra, $W^{(n)}$ could have, say, dimension $2n^2$ instead of $n^2$.


18.5  The  Role  of  Spin 

The spin of the electron is 1/2. As discussed in Sect. 17.8, this means that the Hilbert space for an electron is $L^2(\mathbb{R}^3) \otimes V_{1/2}$, where $V_{1/2}$ is a 2-dimensional vector space that carries an irreducible projective unitary representation of SO(3). Up to now, we have neglected the spin in our calculations. The reason for this omission is simple: to first approximation, the spin plays no role in the calculation. Specifically, in the simplest model of a hydrogen atom with spin, the Hamiltonian is simply $\hat{H} \otimes I$, where $\hat{H}$ is the operator in (18.7), acting on $L^2(\mathbb{R}^3)$. For any $n > 0$, we can obtain a basis of eigenvectors for $\hat{H} \otimes I$ with eigenvalue $E_n$ by taking vectors of the form $\psi_{n,l,m} \otimes e_j$, where the $\psi_{n,l,m}$'s are as in (18.10) and where $\{e_1, e_2\}$ forms a basis for $V_{1/2}$.

Now, from the point of view of rotational symmetry, the basis $\psi_{n,l,m} \otimes e_j$ is not the most natural one. Rather, we should decompose the eigenspaces into irreducible invariant subspaces for the (projective) action of SO(3), where SO(3) acts on both $L^2(\mathbb{R}^3)$ and $V_{1/2}$. We have already decomposed the eigenspaces inside $L^2(\mathbb{R}^3)$ into irreducible invariant subspaces, namely the span of $\psi_{n,l,m}$ where $n$ and $l$ are fixed and $m$ varies. Thus, to obtain the irreducible invariant subspaces inside $L^2(\mathbb{R}^3) \otimes V_{1/2}$, we use the method of "addition of angular momentum" from Sect. 17.9. According to Proposition 17.22, $V_l \otimes V_{1/2}$ is irreducible if $l = 0$ and isomorphic to $V_{l+1/2} \oplus V_{l-1/2}$ if $l > 0$. Consider, for example, the case $n = 3$, $l = 1$, the so-called "3p states" in traditional chemistry terminology. Since $V_1 \otimes V_{1/2}$ decomposes as $V_{3/2} \oplus V_{1/2}$, when we take spin into account, we obtain a 4-dimensional space and a 2-dimensional space. We can obtain bases for these spaces by tracing through the proof of Proposition 17.22.
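As a small illustration of this decomposition, the sketch below lists the irreducible pieces of the $n$th energy level once spin is included. It assumes only the rule quoted from Proposition 17.22 and the fact that $l$ ranges over $0, \ldots, n-1$; the helper names are hypothetical. It reproduces the 4-dimensional and 2-dimensional spaces for the 3p states and confirms that the level with spin has total dimension $2n^2$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def couple_with_spin_half(l):
    """Total spins j occurring in V_l (x) V_{1/2}, per Proposition 17.22."""
    if l == 0:
        return [HALF]                 # V_0 (x) V_{1/2} is irreducible
    return [l + HALF, l - HALF]       # V_{l+1/2} (+) V_{l-1/2}

def dim(j):
    """Dimension 2j + 1 of V_j."""
    return int(2 * j + 1)

def level_multiplets(n):
    """Triples (l, j, dimension) for the n-th level of hydrogen with spin."""
    return [(l, j, dim(j)) for l in range(n) for j in couple_with_spin_half(l)]

# The 3p states (n = 3, l = 1) split into a 4- and a 2-dimensional space.
assert [d for l, j, d in level_multiplets(3) if l == 1] == [4, 2]

# Including spin doubles the dimension of each level: n^2 becomes 2 n^2.
for n in range(1, 6):
    assert sum(d for _, _, d in level_multiplets(n)) == 2 * n * n

for l, j, d in level_multiplets(3):
    print(f"l={l}  j={j}  dimension={d}")
```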

The decomposition described in the previous paragraph is essential when considering the "fine structure" of hydrogen. Our model of hydrogen using the Hamiltonian (18.7) is only a first approximation. More realistic models take into account various corrections, including radiative corrections, a finite size for the nucleus, and "spin-orbit coupling," among other things. The notion of spin-orbit coupling adds a term into the Hamiltonian involving the operator $\hat{\mathbf{J}} \cdot \sigma$, where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the operators describing the action of so(3) on $V_{1/2}$. When this term is included, the Hamiltonian is no longer of the form $A \otimes I$ for some operator $A$ on $L^2(\mathbb{R}^3)$. Thus, we can no longer simply append the spin to the end of the computation, but must take it into account from the beginning.

The various corrections to the Hamiltonian for the hydrogen atom have the effect of reducing the multiplicities of the eigenvalues. Almost any correction we make, for example, will destroy the independence of the eigenvalue on $l$ for a given $n$, simply because the correction terms in the Hamiltonian will not commute with the quantum Runge-Lenz vector. Nevertheless, all of the corrections that make up the fine structure of hydrogen preserve the rotational symmetry of the problem. Thus, the same irreducible representations of SO(3) that we had in the simple model will appear after the corrections are made. For $n = 2$, $l = 1$, for example, we will still have a 4-dimensional space and a 2-dimensional space, but these two spaces will no longer have the same energy.


18.6  Runge-Lenz  Calculations 


In this section, we fill in many of the computations that we passed over without proof in Sect. 18.4. Although all the calculations are, in principle, elementary, there are a number of nonobvious tricks that help simplify the algebra. We will make frequent use of the concepts of functions that transform like vectors (on the classical side) and of vector operators (on the quantum side), including Propositions 17.25 and 17.27 (Sect. 17.10). In particular, we note that the position $\mathbf{x}$, the momentum $\mathbf{p}$, the angular momentum $\mathbf{J}$, and the Runge-Lenz vector $\mathbf{A}$ all transform like vectors, and that the corresponding quantum quantities are all vector operators. (Compare Exercise 7.) In the $\varepsilon$ notation of Sect. 18.4.1, Proposition 17.27 takes the form

\[ \frac{1}{i\hbar}[C_j, \hat{J}_k] = \frac{1}{i\hbar}[\hat{J}_j, C_k] = \varepsilon_{jkl} C_l. \tag{18.26} \]

In the quantum mechanical calculations, there are a number of "quantum corrections," in which dot products and cross products of vector operators do not behave as they do in the classical case.


Lemma 18.18 The $\varepsilon$-function in Definition 18.6 satisfies the relations

\[ \varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm}, \qquad \varepsilon_{jkl}\varepsilon_{jkm} = 2\delta_{lm}. \]

The proof of these results is not difficult and is left to the reader (Exercise 6). The following identities involving the cross product of vector operators will be useful to us.


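Lemma 18.19 If $\mathbf{C}$, $\mathbf{D}$, and $\mathbf{E}$ are arbitrary vector operators, we have

\[ \mathbf{C} \cdot (\mathbf{D} \times \mathbf{E}) = (\mathbf{C} \times \mathbf{D}) \cdot \mathbf{E} \tag{18.27} \]
\[ (\mathbf{C} \times \mathbf{D} + \mathbf{D} \times \mathbf{C})_j = \varepsilon_{jkl}[C_k, D_l] \tag{18.28} \]
\[ (\mathbf{C} \times \mathbf{C})_j = \tfrac{1}{2}\varepsilon_{jkl}[C_k, C_l]. \tag{18.29} \]

In particular, if the different components of $\mathbf{C}$ commute, then $\mathbf{C} \times \mathbf{C} = 0$. Finally,

\[ (\mathbf{C} \times (\mathbf{D} \times \mathbf{E}))_j = C_k D_j E_k - C_k D_k E_j. \tag{18.30} \]

As special cases of these results, we have

\[ \hat{\mathbf{J}} \times \mathbf{P} + \mathbf{P} \times \hat{\mathbf{J}} = 2i\hbar\,\mathbf{P} \tag{18.31} \]
\[ \hat{\mathbf{J}} \times \hat{\mathbf{J}} = i\hbar\,\hat{\mathbf{J}}. \tag{18.32} \]

Note that if the entries of $\mathbf{D}$ and $\mathbf{E}$ commute, then the right-hand side of (18.30) reduces to the classical expression, $(\mathbf{C} \cdot \mathbf{E})\mathbf{D} - (\mathbf{C} \cdot \mathbf{D})\mathbf{E}$. Using (18.31), we can easily verify the alternative expression (18.19) for the Runge-Lenz vector.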

Proof. The right-hand side of (18.27) is computed as $\varepsilon_{jkl} C_k D_l E_j$. If we note that $\varepsilon_{jkl} = \varepsilon_{klj}$ and then relabel the indices, we obtain $\varepsilon_{jkl} C_j D_k E_l$, which is equal to the left-hand side of (18.27). For (18.28), we compute that

\[ \begin{aligned} (\mathbf{C} \times \mathbf{D} + \mathbf{D} \times \mathbf{C})_j &= \varepsilon_{jkl} C_k D_l + \varepsilon_{jkl} D_k C_l \\ &= \varepsilon_{jkl} C_k D_l + \varepsilon_{jkl} C_l D_k - \varepsilon_{jkl}[C_l, D_k]. \end{aligned} \tag{18.33} \]

If we note that $\varepsilon_{jkl} = -\varepsilon_{jlk}$ and then relabel the indices $k$ and $l$, we see that $\varepsilon_{jkl} C_l D_k = -\varepsilon_{jkl} C_k D_l$, so that the first two terms in the second line of (18.33) cancel. The remaining term can be put into the claimed form by relabeling the indices $k$ and $l$. The identity (18.29) is just the $\mathbf{D} = \mathbf{C}$ case of (18.28). Finally, (18.30) follows easily from Lemma 18.18.

To obtain (18.31) and (18.32), we apply (18.28) and (18.29), respectively. Since both $\hat{\mathbf{J}}$ and $\mathbf{P}$ are vector operators, the desired result follows easily from Lemma 18.18. ■
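The identities above lend themselves to a quick numerical sanity check. The sketch below verifies the two relations of Lemma 18.18 by brute force over all index values and then checks (18.32) in one concrete case. The concrete case is an assumption made for illustration only: it takes $\hbar = 1$ and uses the spin-1/2 matrices $J_k = \sigma_k/2$ as a finite-dimensional stand-in for the angular momentum operators. Indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def eps(j, k, l):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) // 2

def delta(a, b):
    return 1 if a == b else 0

R3 = range(3)

# Lemma 18.18, first relation: eps_jkl eps_jmn = d_km d_ln - d_kn d_lm.
for k in R3:
    for l in R3:
        for m in R3:
            for n in R3:
                lhs = sum(eps(j, k, l) * eps(j, m, n) for j in R3)
                assert lhs == delta(k, m) * delta(l, n) - delta(k, n) * delta(l, m)

# Lemma 18.18, second relation: eps_jkl eps_jkm = 2 d_lm.
for l in R3:
    for m in R3:
        assert sum(eps(j, k, l) * eps(j, k, m) for j in R3 for k in R3) == 2 * delta(l, m)

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def cross_component(C, D, j):
    """(C x D)_j = eps_jkl C_k D_l for triples of 2x2 matrices."""
    total = [[0j, 0j], [0j, 0j]]
    for k in R3:
        for l in R3:
            e = eps(j, k, l)
            if e:
                prod = matmul(C[k], D[l])
                total = [[total[r][c] + e * prod[r][c] for c in range(2)]
                         for r in range(2)]
    return total

# Spin-1/2 matrices with hbar = 1 (an illustrative stand-in for J).
J = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# (18.32): J x J = i hbar J, checked component by component.
for j in R3:
    lhs = cross_component(J, J, j)
    for r in range(2):
        for c in range(2):
            assert abs(lhs[r][c] - 1j * J[j][r][c]) < 1e-12
```

Because the components of a classical vector commute, the same computation with numbers in place of matrices would give zero; the nonzero right-hand side of (18.32) is exactly the kind of quantum correction discussed in this section.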

We  now  turn  to  the  proofs  of  the  results  of  Sect.  18.4.  We  prove  only  the 
quantum  versions  of  the  results,  since  the  classical  results  are  extremely 
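similar, except that certain quantum corrections can be ignored.

Proof of Lemma 18.12, First Part. We begin by showing that $\hat{A}_j$ commutes with $\hat{H}$ for each $j$. Since $\hat{H}$ commutes with $\hat{\mathbf{J}}$, we have

\[ [\hat{A}_j, \hat{H}] = \frac{1}{2\mu Q^2}\,\varepsilon_{jkl}\left([P_k, \hat{H}]\hat{J}_l - \hat{J}_k[P_l, \hat{H}]\right) - \left[\frac{X_j}{|\mathbf{X}|}, \hat{H}\right]. \]

Meanwhile, since the $P$'s commute among themselves, we have

\[ [P_k, \hat{H}] = \left[P_k, -\frac{Q^2}{|\mathbf{X}|}\right] = -i\hbar Q^2 \frac{X_k}{|\mathbf{X}|^3}. \]

Thus,

\[ \begin{aligned} \varepsilon_{jkl}[P_k, \hat{H}]\hat{J}_l &= -i\hbar Q^2\,\varepsilon_{jkl}\varepsilon_{lmn}\frac{X_k}{|\mathbf{X}|^3} X_m P_n \\ &= -i\hbar Q^2 (\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km})\frac{X_k}{|\mathbf{X}|^3} X_m P_n \\ &= -i\hbar Q^2 \frac{1}{|\mathbf{X}|^3}(X_n X_j P_n - X_m X_m P_j) \\ &= -i\hbar Q^2 \frac{1}{|\mathbf{X}|^3}\left(X_j(\mathbf{X} \cdot \mathbf{P}) - (\mathbf{X} \cdot \mathbf{X})P_j\right). \end{aligned} \tag{18.34} \]

We compute $\varepsilon_{jkl}\hat{J}_k[P_l, \hat{H}]$ in a similar way. Note that $\hat{J}_k = \varepsilon_{kmn} X_m P_n = \varepsilon_{kmn} P_n X_m$, since $X_m$ and $P_n$ commute except when $m = n$, in which case $\varepsilon_{kmn} = 0$. The result is

\[ \varepsilon_{jkl}\hat{J}_k[P_l, \hat{H}] = -i\hbar Q^2\left(P_j(\mathbf{X} \cdot \mathbf{X}) - (\mathbf{P} \cdot \mathbf{X})X_j\right)\frac{1}{|\mathbf{X}|^3}. \]

Meanwhile, since the $X$'s commute among themselves, we have

\[ \begin{aligned} \left[\frac{X_j}{|\mathbf{X}|}, \hat{H}\right] &= \left[\frac{X_j}{|\mathbf{X}|}, \frac{P^2}{2\mu}\right] \\ &= \frac{1}{2\mu}\left(\left[\frac{X_j}{|\mathbf{X}|}, P_k\right]P_k + P_k\left[\frac{X_j}{|\mathbf{X}|}, P_k\right]\right) \\ &= \frac{i\hbar}{2\mu}\left(\frac{\delta_{jk}}{|\mathbf{X}|} - \frac{X_j X_k}{|\mathbf{X}|^3}\right)P_k + \frac{i\hbar}{2\mu}P_k\left(\frac{\delta_{jk}}{|\mathbf{X}|} - \frac{X_j X_k}{|\mathbf{X}|^3}\right) \\ &= \frac{i\hbar}{2\mu}\left(\frac{1}{|\mathbf{X}|}P_j - \frac{X_j}{|\mathbf{X}|^3}(\mathbf{X} \cdot \mathbf{P})\right) + \frac{i\hbar}{2\mu}\left(P_j\frac{1}{|\mathbf{X}|} - (\mathbf{P} \cdot \mathbf{X})\frac{X_j}{|\mathbf{X}|^3}\right). \end{aligned} \tag{18.35} \]

It is now a simple matter to compute $[\hat{A}_j, \hat{H}]$ by combining (18.34) and (18.35) and verify that everything cancels. We have, for example, a term involving $(X_j/|\mathbf{X}|^3)(\mathbf{X} \cdot \mathbf{P})$ in (18.34) and a canceling term in (18.35). ■

Before proceeding with the remaining results concerning the Runge-Lenz vector, we verify some results that will be needed later. There are some quantum corrections compared to the corresponding classical results.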

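Lemma 18.20 As in the classical case, the following "orthogonality" relations among vector operators hold:

\[ \hat{\mathbf{J}} \cdot \mathbf{P} = \mathbf{P} \cdot \hat{\mathbf{J}} = 0 \tag{18.36} \]
\[ \hat{\mathbf{J}} \cdot \mathbf{X} = \mathbf{X} \cdot \hat{\mathbf{J}} = 0 \tag{18.37} \]
\[ (\mathbf{P} \times \hat{\mathbf{J}}) \cdot \hat{\mathbf{J}} = \hat{\mathbf{J}} \cdot (\mathbf{P} \times \hat{\mathbf{J}}) = 0. \tag{18.38} \]

Meanwhile, there is a quantum correction in the dot product between $\mathbf{P}$ and $\mathbf{P} \times \hat{\mathbf{J}}$, as follows:

\[ \mathbf{P} \cdot (\mathbf{P} \times \hat{\mathbf{J}}) = 0 \tag{18.39} \]
\[ (\mathbf{P} \times \hat{\mathbf{J}}) \cdot \mathbf{P} = 2i\hbar(\mathbf{P} \cdot \mathbf{P}). \tag{18.40} \]

Finally, we have

\[ (\mathbf{P} \times \hat{\mathbf{J}}) \cdot (\mathbf{P} \times \hat{\mathbf{J}}) = (\mathbf{P} \cdot \mathbf{P})(\hat{\mathbf{J}} \cdot \hat{\mathbf{J}}) \tag{18.41} \]
\[ \mathbf{X} \cdot (\mathbf{P} \times \hat{\mathbf{J}}) = \hat{\mathbf{J}} \cdot \hat{\mathbf{J}} \tag{18.42} \]
\[ (\mathbf{P} \times \hat{\mathbf{J}}) \cdot \mathbf{X} = \hat{\mathbf{J}} \cdot \hat{\mathbf{J}} + 2i\hbar\,\mathbf{P} \cdot \mathbf{X}. \tag{18.43} \]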

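Proof. By (18.27) and (18.29), we have

\[ \hat{\mathbf{J}} \cdot \mathbf{P} = (\mathbf{X} \times \mathbf{P}) \cdot \mathbf{P} = \mathbf{X} \cdot (\mathbf{P} \times \mathbf{P}) = 0, \]

since the different components of $\mathbf{P}$ commute. The same reasoning shows that $\mathbf{P} \cdot \hat{\mathbf{J}}$, $\hat{\mathbf{J}} \cdot \mathbf{X}$, and $\mathbf{X} \cdot \hat{\mathbf{J}}$ are all zero. To compute $(\mathbf{P} \times \hat{\mathbf{J}}) \cdot \hat{\mathbf{J}}$, we first use (18.27), then use (18.32), and then use that $\mathbf{P} \cdot \hat{\mathbf{J}} = 0$. For $\hat{\mathbf{J}} \cdot (\mathbf{P} \times \hat{\mathbf{J}})$, we rewrite $\mathbf{P} \times \hat{\mathbf{J}}$ in terms of $\hat{\mathbf{J}} \times \mathbf{P}$, using (18.31). The correction term involves $\mathbf{P}$, which has a dot product of zero with $\hat{\mathbf{J}}$, and so the answer is again zero.

We use (18.27) and (18.29) again to establish (18.39). To get (18.40), we first rewrite $\mathbf{P} \times \hat{\mathbf{J}}$ in terms of $\hat{\mathbf{J}} \times \mathbf{P}$ using (18.31) and then apply (18.39). To establish (18.41), we apply (18.27) and then (18.30), giving

\[ (\mathbf{P} \times \hat{\mathbf{J}}) \cdot (\mathbf{P} \times \hat{\mathbf{J}}) = P_j \hat{J}_k P_j \hat{J}_k - P_j \hat{J}_k P_k \hat{J}_j. \tag{18.44} \]

The second term on the right-hand side of (18.44) is zero because $\hat{\mathbf{J}} \cdot \mathbf{P} = 0$. For the first term, we move $\hat{J}_k$ to the right past $P_j$. This generates the term we want plus a correction term equal to $i\hbar\,\varepsilon_{kjl} P_j P_l \hat{J}_k$. The correction term is zero because $P_j$ and $P_l$ commute and $\varepsilon_{kjl}$ changes sign under interchange of $j$ and $l$. The identity (18.42) follows immediately from (18.27) and the definition of $\hat{\mathbf{J}}$. The identity (18.43) follows from (18.27) and (18.28). ■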


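Lemma 18.21 For all $j$ and $m$, we have

\[ \frac{1}{i\hbar}\left[(\mathbf{P} \times \hat{\mathbf{J}})_j, (\mathbf{P} \times \hat{\mathbf{J}})_m\right] = -(\mathbf{P} \cdot \mathbf{P})\,\varepsilon_{jml}\hat{J}_l. \]

Proof. In computing $[P_k \hat{J}_l, P_n \hat{J}_o]$, we use repeatedly the product rule for commutators (Point 3 of Proposition 3.15). We obtain four terms, one of which is zero (the term involving $[P_k, P_n]$). We use Proposition 17.27 (in the form (18.26)) to evaluate all remaining terms, giving

\[ \frac{1}{i\hbar}\left[\varepsilon_{jkl} P_k \hat{J}_l, \varepsilon_{mno} P_n \hat{J}_o\right] = \frac{1}{i\hbar}\,\varepsilon_{jkl}\varepsilon_{mno}\left(P_k[\hat{J}_l, P_n]\hat{J}_o + P_n P_k[\hat{J}_l, \hat{J}_o] + P_n[P_k, \hat{J}_o]\hat{J}_l\right). \tag{18.45} \]

Let us compute the first of the three terms on the right-hand side of (18.45). Using Lemma 18.18 and the fact that $\mathbf{P}$ is a vector operator, we get

\[ \begin{aligned} \frac{1}{i\hbar}\,\varepsilon_{jkl}\varepsilon_{mno} P_k[\hat{J}_l, P_n]\hat{J}_o &= \varepsilon_{jkl}(\delta_{op}\delta_{ml} - \delta_{ol}\delta_{mp})P_k P_p \hat{J}_o \\ &= \varepsilon_{jkm} P_k P_p \hat{J}_p - \varepsilon_{jko} P_k P_m \hat{J}_o \\ &= \varepsilon_{jkm} P_k(\mathbf{P} \cdot \hat{\mathbf{J}}) - P_m(\mathbf{P} \times \hat{\mathbf{J}})_j. \end{aligned} \]

If we compute the second and third terms similarly, we obtain

\[ \begin{aligned} \frac{1}{i\hbar}\left[\varepsilon_{jkl} P_k \hat{J}_l, \varepsilon_{mno} P_n \hat{J}_o\right] &= \varepsilon_{jkm} P_k(\mathbf{P} \cdot \hat{\mathbf{J}}) - P_m(\mathbf{P} \times \hat{\mathbf{J}})_j \\ &\quad + (\mathbf{P} \times \mathbf{P})_j \hat{J}_m - \varepsilon_{jkm} P_k(\mathbf{P} \cdot \hat{\mathbf{J}}) + P_m(\mathbf{P} \times \hat{\mathbf{J}})_j - (\mathbf{P} \cdot \mathbf{P})\,\varepsilon_{jml}\hat{J}_l. \end{aligned} \]

Three of the above terms are zero (those involving $\mathbf{P} \cdot \hat{\mathbf{J}}$ or $\mathbf{P} \times \mathbf{P}$) and two other terms cancel, leaving us with

\[ \frac{1}{i\hbar}\left[\varepsilon_{jkl} P_k \hat{J}_l, \varepsilon_{mno} P_n \hat{J}_o\right] = -(\mathbf{P} \cdot \mathbf{P})\,\varepsilon_{jml}\hat{J}_l, \]

as claimed. ■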

We  now  continue  with  the  proof  of  the  properties  of  the  Runge-Lenz 
vector. 

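Proof of Proposition 18.11. From the first set of orthogonality relations in Lemma 18.20, we can see easily that $\hat{\mathbf{J}} \cdot \hat{\mathbf{A}} = \hat{\mathbf{A}} \cdot \hat{\mathbf{J}} = 0$. Meanwhile, using the expression (18.19) for $\hat{\mathbf{A}}$ and expanding out $\hat{\mathbf{A}} \cdot \hat{\mathbf{A}}$ yields, after a little simplification,

\[ \hat{\mathbf{A}} \cdot \hat{\mathbf{A}} = 1 + \frac{1}{\mu^2 Q^4}(\mathbf{P} \cdot \mathbf{P})\left(\hat{\mathbf{J}} \cdot \hat{\mathbf{J}} + \hbar^2\right) - \frac{1}{\mu Q^2}\left(2(\hat{\mathbf{J}} \cdot \hat{\mathbf{J}})\frac{1}{|\mathbf{X}|} + i\hbar\left(\mathbf{P} \cdot \frac{\mathbf{X}}{|\mathbf{X}|} - \frac{\mathbf{X}}{|\mathbf{X}|} \cdot \mathbf{P}\right)\right). \]

Now,

\[ \mathbf{P} \cdot \frac{\mathbf{X}}{|\mathbf{X}|} - \frac{\mathbf{X}}{|\mathbf{X}|} \cdot \mathbf{P} = -i\hbar\left(\frac{\delta_{kk}}{|\mathbf{X}|} - \frac{X_k X_k}{|\mathbf{X}|^3}\right) = -2i\hbar\frac{1}{|\mathbf{X}|}. \]

Thus,

\[ \hat{\mathbf{A}} \cdot \hat{\mathbf{A}} = 1 + \frac{2}{\mu Q^4}\left((\hat{\mathbf{J}} \cdot \hat{\mathbf{J}}) + \hbar^2\right)\left(\frac{\mathbf{P} \cdot \mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right) = 1 + \frac{2}{\mu Q^4}\left((\hat{\mathbf{J}} \cdot \hat{\mathbf{J}}) + \hbar^2\right)\hat{H}, \]

as claimed. ■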

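Proof of Lemma 18.12, Second Part. We write $\hat{\mathbf{A}}$ in the form given in (18.19). In computing the commutator of $\hat{A}_j$ with $\hat{A}_m$, we get several different types of terms, which we compute one at a time. Of course, the commutator of $X_j/|\mathbf{X}|$ with $X_m/|\mathbf{X}|$ is zero. The commutator of the $\mathbf{P} \times \hat{\mathbf{J}}$ terms has been computed in Lemma 18.21.

Meanwhile, to compute the commutator of $P_k \hat{J}_l$ with $X_m(1/|\mathbf{X}|)$, we again get four terms and, again, one of these is zero, namely the one involving $[\hat{J}_l, 1/|\mathbf{X}|]$, since $1/|\mathbf{X}|$ is invariant under rotations. We have, then,

\[ \begin{aligned} \frac{1}{i\hbar}\left[\varepsilon_{jkl} P_k \hat{J}_l, X_m\frac{1}{|\mathbf{X}|}\right] &= \frac{1}{i\hbar}\,\varepsilon_{jkl}[P_k, X_m]\hat{J}_l\frac{1}{|\mathbf{X}|} + \frac{1}{i\hbar}\,\varepsilon_{jkl} P_k[\hat{J}_l, X_m]\frac{1}{|\mathbf{X}|} + \frac{1}{i\hbar}\,\varepsilon_{jkl} X_m\left[P_k, \frac{1}{|\mathbf{X}|}\right]\hat{J}_l \\ &= -\varepsilon_{jkl}\delta_{km}\hat{J}_l\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl}\varepsilon_{lmn} P_k X_n\frac{1}{|\mathbf{X}|} + \varepsilon_{jkl} X_m\frac{X_k}{|\mathbf{X}|^3}\,\varepsilon_{lno} X_n P_o. \end{aligned} \]

If we apply Lemma 18.18 and carry out some computations similar to ones we have already performed, we obtain

\[ -\varepsilon_{jml}\hat{J}_l\frac{1}{|\mathbf{X}|} + \delta_{jm}(\mathbf{P} \cdot \mathbf{X})\frac{1}{|\mathbf{X}|} + \frac{X_m X_j}{|\mathbf{X}|^3}(\mathbf{X} \cdot \mathbf{P}) - \left(P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j\right). \tag{18.46} \]

In a commutator of the form $[\alpha_j + \beta_j, \alpha_m + \beta_m]$, the terms involving the commutator of an $\alpha$ with a $\beta$ will be $[\alpha_j, \beta_m] + [\beta_j, \alpha_m]$, which is equal to $[\alpha_j, \beta_m] - [\alpha_m, \beta_j]$. This quantity is skew-symmetric in $j$ and $m$, meaning that it changes sign when we interchange $j$ with $m$. Thus, terms in (18.46) that are symmetric in $j$ and $m$ will disappear when we compute the full commutator of $\hat{A}_j$ with $\hat{A}_m$. Thus, the second and third terms in (18.46) can be ignored. In the last term, we can commute $P_m$ past $X_j$ to obtain

\[ P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j = \frac{X_j}{|\mathbf{X}|}P_m + \frac{X_m}{|\mathbf{X}|}P_j - i\hbar\left(\frac{\delta_{jm}}{|\mathbf{X}|} - \frac{X_j X_m}{|\mathbf{X}|^3}\right), \tag{18.47} \]

which is also symmetric. Thus, only the first term in (18.46) contributes to the computation of $[\hat{A}_j, \hat{A}_m]$. This term is skew-symmetric in $j$ and $m$ and will be doubled when we compute $[\hat{A}_j, \hat{A}_m]$.

Now, it is straightforward to compute $[\varepsilon_{jkl} P_k \hat{J}_l, P_m]$ and $[P_j, X_m/|\mathbf{X}|]$ and to verify that these commutators are symmetric in $j$ and $m$ (Exercise 8) and therefore do not contribute to the computation of $[\hat{A}_j, \hat{A}_m]$. We are left, then, with the following:

\[ \begin{aligned} \frac{1}{i\hbar}[\hat{A}_j, \hat{A}_m] &= -\frac{1}{\mu^2 Q^4}\,\varepsilon_{jml}(\mathbf{P} \cdot \mathbf{P})\hat{J}_l + \frac{2}{\mu Q^2}\,\varepsilon_{jml}\hat{J}_l\frac{1}{|\mathbf{X}|} \\ &= -\frac{2}{\mu Q^4}\,\varepsilon_{jml}\hat{J}_l\left(\frac{\mathbf{P} \cdot \mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right), \end{aligned} \]

which is what is claimed in the lemma. ■

Proof of Theorem 18.14. Since the Hamiltonian $\hat{H}$ is invariant under rotations, $\hat{H}$ commutes with each component of the angular momentum. We have also established that $\hat{H}$ commutes with each component of the Runge-Lenz vector. From this it follows easily that $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$ commute with the Hamiltonian.

Since $\hat{A}_k$ commutes with $\hat{H}$, it also commutes with any function of $\hat{H}$. It then follows from Lemma 18.12 that

\[ \frac{1}{i\hbar}[\hat{B}_j, \hat{B}_m] = -\frac{\mu Q^4}{2|\hat{H}|}\,\frac{2}{\mu Q^4}\,\varepsilon_{jml}\hat{J}_l\hat{H}. \]

Since $\hat{H}/|\hat{H}| = -I$ on the negative-energy subspace $V^-$, the above expression reduces to $\varepsilon_{jml}\hat{J}_l$. (The result on the positive-energy subspace will differ by a crucial minus sign from what we have on $V^-$.)

Meanwhile, since both $\hat{\mathbf{B}}$ and $\hat{\mathbf{J}}$ are vector operators, we have, by Proposition 17.27, $(1/(i\hbar))[\hat{B}_j, \hat{J}_k] = \varepsilon_{jkl}\hat{B}_l$ and $(1/(i\hbar))[\hat{J}_j, \hat{J}_k] = \varepsilon_{jkl}\hat{J}_l$. From the commutation relations among the $\hat{B}_j$'s and $\hat{J}_j$'s, it is an easy calculation to verify the claimed commutation relations among the components of $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$. ■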


18.7  Exercises 


1. Consider the quantum Hamiltonian for two particles in $\mathbb{R}^3$ interacting by means of a $1/r$ potential:

\[ \hat{H}_2 = -\frac{\hbar^2}{2m_1}\Delta_1 - \frac{\hbar^2}{2m_2}\Delta_2 - \frac{Q^2}{|\mathbf{x}^1 - \mathbf{x}^2|}. \]

Here, as in Sect. 3.11, $\Delta_1$ is the Laplacian with respect to the variable $\mathbf{x}^1$ and $\Delta_2$ is the Laplacian with respect to the variable $\mathbf{x}^2$. As in Sect. 2.3.3, introduce new variables consisting of the center of mass, $\mathbf{c} = (m_1\mathbf{x}^1 + m_2\mathbf{x}^2)/(m_1 + m_2)$, and the relative position, $\mathbf{y} = \mathbf{x}^1 - \mathbf{x}^2$. Show that $\hat{H}_2$ can be expressed in these variables as

\[ \hat{H}_2 = -\frac{\hbar^2}{2(m_1 + m_2)}\Delta_{\mathbf{c}} - \frac{\hbar^2}{2\mu}\Delta_{\mathbf{y}} - Q^2\frac{1}{|\mathbf{y}|}, \]

where $\mu$ is the reduced mass, given by $\mu = m_1 m_2/(m_1 + m_2)$.

Note: In the new variables, $\hat{H}_2$ is the sum of two terms, one of which involves only the variable $\mathbf{c}$ and one of which involves only the variable $\mathbf{y}$. The term involving only $\mathbf{c}$ is the Hamiltonian for a free particle with mass $m_1 + m_2$, whereas the term involving only $\mathbf{y}$ is the Hamiltonian for a particle of mass $\mu$ moving in a $1/r$ potential.

2. Let $H(\mathbf{x}, \mathbf{p}) = |\mathbf{p}|^2/(2\mu) - Q^2/|\mathbf{x}|$ denote the Hamiltonian for the classical Kepler problem in $\mathbb{R}^3$. Show that for every $\varepsilon > 0$, the region in $\mathbb{R}^6$ given by $\{(\mathbf{x}, \mathbf{p}) \mid H(\mathbf{x}, \mathbf{p}) < -\varepsilon\}$ has finite volume.

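3. Let $\mathbb{H}$ denote the real span of the following four elements of $M_2(\mathbb{C})$:

\[ \mathbf{1} := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \mathbf{i} := \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad \mathbf{j} := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \mathbf{k} := \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]

(a) Show that $\mathbb{H}$ forms an associative algebra over $\mathbb{R}$, under the operation of matrix multiplication, and that the following relations are satisfied:

\[ \mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i} = \mathbf{k}, \qquad \mathbf{j}\mathbf{k} = -\mathbf{k}\mathbf{j} = \mathbf{i}, \qquad \mathbf{k}\mathbf{i} = -\mathbf{i}\mathbf{k} = \mathbf{j}. \]

The algebra $\mathbb{H}$ is (one particular realization of) the quaternion algebra.

(b) Show that each nonzero element of $\mathbb{H}$ has a multiplicative inverse.

Hint: Imitate the argument that each nonzero complex number has a multiplicative inverse.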

4. Let $\mathbb{H}$ denote the quaternion algebra defined in Exercise 3. This exercise establishes explicitly an isomorphism between the Lie algebras so(4) and so(3) ⊕ so(3) (compare Definition 16.14).

(a) Let $V$ be the subspace of $\mathbb{H}$ spanned by $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Show that $V$ forms a Lie algebra under the bracket $[\alpha, \beta] = \alpha\beta - \beta\alpha$ and that $V$ is isomorphic as a Lie algebra to so(3).

(b) Let $\mathrm{End}(\mathbb{H})$ denote the algebra of real-linear maps of $\mathbb{H}$ to itself. Given $\alpha \in V$, let $L_\alpha \in \mathrm{End}(\mathbb{H})$ be the "left multiplication by $\alpha$" map, $L_\alpha(\beta) = \alpha\beta$, and let $R_\alpha \in \mathrm{End}(\mathbb{H})$ be the "right multiplication by $\alpha$" map, $R_\alpha(\beta) = \beta\alpha$. Show that the maps $\alpha \mapsto L_\alpha$ and $\alpha \mapsto -R_\alpha$ are Lie algebra homomorphisms of $V$ into $\mathrm{End}(\mathbb{H})$.

(c) Consider the inner product on $\mathbb{H}$ in which $\{\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}\}$ forms an orthonormal basis. Given $\alpha \in V$, show that

\[ \langle L_\alpha\beta, \gamma\rangle = -\langle\beta, L_\alpha\gamma\rangle, \qquad \langle R_\alpha\beta, \gamma\rangle = -\langle\beta, R_\alpha\gamma\rangle. \]

That is to say, $L_\alpha$ and $R_\alpha$ belong to so(4), which we identify with the space of elements of $\mathrm{End}(\mathbb{H})$ that are skew-symmetric with respect to the inner product in Part (c).

(d) Show that the map $(\alpha, \beta) \mapsto L_\alpha - R_\beta$ is a Lie algebra isomorphism of so(3) ⊕ so(3) to so(4).

(e) Let $D$ denote the diagonal subalgebra of so(3) ⊕ so(3), that is, the set of elements of the form $(X, X)$. Show that the image of $D$ under the isomorphism in Part (d) is the set of elements $Y$ of so(4) $\subset \mathrm{End}(\mathbb{H})$ having the following form with respect to the basis in Part (c):

\[ Y = \begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix}, \]

where $Z \in$ so(3).


5. Describe explicitly the two subalgebras of so(4) corresponding to the two copies of so(3) in the isomorphism

\[ \mathrm{so}(4) \cong \mathrm{so}(3) \oplus \mathrm{so}(3) \]

in Exercise 4.


6. Verify Lemma 18.18.

Hint: First show that $\varepsilon_{jkl}\varepsilon_{jmn} = 0$ unless $(k, l) = (m, n)$ or $(k, l) = (n, m)$.

7. In this exercise, we use the summation convention of Sect. 18.4.1.

(a) Show that for any $3 \times 3$ matrix $M$ and any indices $j, k, l \in \{1, 2, 3\}$, we have

\[ \varepsilon_{mno} M_{jm} M_{kn} M_{lo} = \varepsilon_{jkl}(\det M). \]

(b) Show that if $\mathbf{C}$ is a vector operator, then for all $R \in \mathrm{SO}(3)$, we have

\[ \Pi(R) C_k \Pi(R)^{-1} = R_{lk} C_l. \]

(c) Show that the cross product of two vector operators is a vector operator.

Hint: Write the definition of a vector operator in the equivalent form

\[ \mathbf{v} \cdot \mathbf{C} = \Pi(R)\left((R^{-1}\mathbf{v}) \cdot \mathbf{C}\right)\Pi(R)^{-1}. \]


8. Compute $[\varepsilon_{jkl} P_k \hat{J}_l, P_m]$ and $[P_j, X_m/|\mathbf{X}|]$ and show that both of these quantities are symmetric in $j$ and $m$, meaning that the value is unchanged if we interchange $j$ and $m$.


9.  Show  that  the  Eq.  (18.14)  has  two  power  series  solutions  for  g(p),  one 
starting  with  and  one  starting  with  p°. 
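For readers who want a numerical sanity check before attempting Exercise 3, the following sketch multiplies the four matrices defined there and confirms the stated relations. It is not a solution to the exercise: it checks the relations only for these particular matrices and checks the inverse formula for a single sample element. The helper names (`mul`, `neg`, `quat`) and the sample coefficients are illustrative choices.

```python
# The four matrices of Exercise 3, as tuples of rows with complex entries.
ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = ((0, 1j), (1j, 0))

def mul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

def neg(a):
    """Negative of a 2x2 matrix."""
    return tuple(tuple(-x for x in row) for row in a)

# Relations from Part (a).
assert mul(I, J) == K and mul(J, I) == neg(K)
assert mul(J, K) == I and mul(K, J) == neg(I)
assert mul(K, I) == J and mul(I, K) == neg(J)

def quat(a, b, c, d):
    """The element a*1 + b*i + c*j + d*k of the real span."""
    return ((complex(a, b), complex(c, d)), (complex(-c, d), complex(a, -b)))

# Part (b), for one sample element: q times its "conjugate" is |q|^2 times 1,
# so dividing the conjugate by |q|^2 = a^2 + b^2 + c^2 + d^2 inverts q.
q = quat(1, 2, 3, 4)
q_conj = quat(1, -2, -3, -4)
assert mul(q, q_conj) == ((30, 0), (0, 30))
```

The last assertion mirrors the argument for complex numbers suggested in the hint: the norm squared $1^2 + 2^2 + 3^2 + 4^2 = 30$ plays the role of $|z|^2$.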


19 

Systems  and  Subsystems, 
Multiple  Particles 


19.1  Introduction 

Up to this point, we have considered the state of a quantum system to be described by a unit vector in the corresponding Hilbert space, or more properly, an equivalence class of unit vectors under the equivalence relation $\psi \sim e^{i\theta}\psi$. We will see in this section that this notion of the state of a quantum system is too limited. We will introduce a more general notion of the state of a system, described by a density matrix. The special case in which the system can be described by a unit vector will be called a pure state.

One way to see the inadequacy of the notion of state as a unit vector is to consider systems and subsystems. We will examine this topic in greater detail in Sect. 19.5, but for now let us consider the example of a system of two spinless "distinguishable" particles moving in $\mathbb{R}^3$. (For now, the reader need not worry about the notion of distinguishable particles; just think of them as being two different types of particles, with, say, different masses or charges.) Let us assume the combined state of the two particles can be described by a unit vector in the corresponding Hilbert space, which is (according to Sect. 3.11) $L^2(\mathbb{R}^6)$. We have, then, a wave function $\psi(\mathbf{x}, \mathbf{y})$, where $\mathbf{x}$ is the position of the first particle and $\mathbf{y}$ is the position of the second particle.

Given a wave function $\psi(\mathbf{x}, \mathbf{y})$ for the combined system, what is the wave function describing the state of the first particle only? If the wave function of the combined system happens to be a product, say, $\psi(\mathbf{x}, \mathbf{y}) = \psi_1(\mathbf{x})\psi_2(\mathbf{y})$, then, naturally, we would say that the state of the first particle is simply $\psi_1$. Of course, one might object that we could rewrite $\psi$ as $\psi(\mathbf{x}, \mathbf{y}) = [c\psi_1(\mathbf{x})][\psi_2(\mathbf{y})/c]$ for any constant $c$, but this only affects the wave function for the first particle by a constant, which does not affect the physical state.

In general, however, the wave function of the combined system need not be a product. Already when $\psi$ is a linear combination of two products, $\psi(\mathbf{x}, \mathbf{y}) = \psi_1(\mathbf{x})\psi_2(\mathbf{y}) + \phi_1(\mathbf{x})\phi_2(\mathbf{y})$, it is unclear what the correct wave function is for the first particle. At first glance, it might seem natural to try $\psi_1(\mathbf{x}) + \phi_1(\mathbf{x})$, but upon closer examination, this is not an unambiguous proposal. After all, we can just as well write $\psi(\mathbf{x}, \mathbf{y}) = [c_1\psi_1(\mathbf{x})][\psi_2(\mathbf{y})/c_1] + [c_2\phi_1(\mathbf{x})][\phi_2(\mathbf{y})/c_2]$, but then the resulting wave functions for the first particle, $\psi_1(\mathbf{x}) + \phi_1(\mathbf{x})$ and $c_1\psi_1(\mathbf{x}) + c_2\phi_1(\mathbf{x})$, are not scalar multiples of one another. For a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$, the situation is even worse. The conclusion is this: There does not seem to be any way to associate with a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$ a unit vector $\psi'$ in $L^2(\mathbb{R}^3)$ such that $\psi'$ could sensibly be described as "the state of the first particle."

Although we cannot associate with $\psi$ a wave function $\psi'$ for the first particle, there is no difficulty in taking expectation values of observables related to the first particle. We can make perfect sense of, say, the expected position of the first particle, as

\[ \left\langle X_j^{(1)} \right\rangle_\psi = \int_{\mathbb{R}^6} x_j\,|\psi(\mathbf{x}, \mathbf{y})|^2\,d\mathbf{x}\,d\mathbf{y}. \]

Here $X_j^{(1)}$ indicates the operator of multiplication by the $j$th component of the first vector in the function $\psi(\cdot, \cdot) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{C}$. That is to say, the operator $X_j$ acting on $L^2(\mathbb{R}^3)$ can be "promoted" to an operator on $L^2(\mathbb{R}^6)$ by having it act in the first variable only. Similarly, the momentum operator $P_j$ on $L^2(\mathbb{R}^3)$ can be promoted to an operator $P_j^{(1)}$ on $L^2(\mathbb{R}^6)$, by letting it act on the first variable, meaning that $P_j^{(1)}\psi$ is $-i\hbar$ times the partial derivative with respect to the $j$th component of the first vector in $\psi$. In fact, as we will see in Sect. 19.5, given any self-adjoint operator on $L^2(\mathbb{R}^3)$, there is a natural way to promote it into an operator on $L^2(\mathbb{R}^6)$, where its expectation value may then be defined.

Thus, although there is no natural way to associate with a unit vector $\psi$ in $L^2(\mathbb{R}^6)$ a unit vector in $L^2(\mathbb{R}^3)$, there is a natural way to associate with $\psi$ expectation values of observables on $L^2(\mathbb{R}^3)$. This suggests that we should introduce a more general notion of the "state" of a quantum system, a notion in which with each "reasonable" family of expectation values for the quantum observables there is associated a quantum state. This notion turns out to be that of density matrices (positive, self-adjoint operators with trace 1).

In Sect. 19.3, we introduce the notion of a density matrix. Theorem 19.9 in that section will tell us that, given any reasonable assignment $\Phi$ of expectation values to observables, there is a unique density matrix $\rho$ such that $\Phi(A) = \mathrm{trace}(\rho A)$ for all observables $A$. In the special case in which the state of the system is given by a unit vector $\psi$ in the Hilbert space, then $\rho$ will be just the projection onto $\psi$ and $\mathrm{trace}(\rho A)$ will be equal to the familiar expression $\langle\psi, A\psi\rangle$. In Sect. 19.5, we will consider composite quantum systems and introduce a method (the partial trace) of defining a density matrix for a subsystem from a density matrix for the whole system. Finally, in Sect. 19.6, we will consider the important special case of composite systems made up of multiple identical particles.
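The phenomenon described in this introduction already appears in finite dimensions. The following is a minimal sketch under a simplifying assumption: the two-particle space $L^2(\mathbb{R}^6)$ is replaced by $\mathbb{C}^2 \otimes \mathbb{C}^2$, and a state is stored as a coefficient array `c[a][b]`. The function `reduced_density` computes the matrix that reproduces all expectation values of observables acting on the first factor only, which is the finite-dimensional version of the partial trace mentioned above. All names in the code are illustrative.

```python
import math

def reduced_density(c):
    """rho[a][a2] = sum_b c[a][b] * conj(c[a2][b]) for a state c on C^n (x) C^m."""
    n, m = len(c), len(c[0])
    return [[sum(c[a][b] * complex(c[a2][b]).conjugate() for b in range(m))
             for a2 in range(n)] for a in range(n)]

def trace_product(rho, A):
    """trace(rho A) for square matrices given as nested lists."""
    n = len(rho)
    return sum(rho[a][a2] * A[a2][a] for a in range(n) for a2 in range(n))

def purity(rho):
    """trace(rho^2); equals 1 exactly when rho is a rank-one projection."""
    return trace_product(rho, rho).real

s = 1 / math.sqrt(2)
product_state = [[s, 0], [s, 0]]      # ((e0 + e1)/sqrt(2)) (x) e0
entangled_state = [[s, 0], [0, s]]    # (e0 (x) e0 + e1 (x) e1)/sqrt(2)

sigma_x = [[0, 1], [1, 0]]            # an observable on the first factor

for name, c in [("product", product_state), ("entangled", entangled_state)]:
    rho = reduced_density(c)
    print(name,
          "trace:", round(trace_product(rho, [[1, 0], [0, 1]]).real, 12),
          "purity:", round(purity(rho), 12),
          "<sigma_x>:", round(trace_product(rho, sigma_x).real, 12))
```

For the product state the reduced matrix is a projection (purity 1), so the first factor has a wave function. For the entangled state the purity is 1/2, so no single unit vector in $\mathbb{C}^2$ reproduces the expectation values, even though every expectation value is perfectly well defined.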


19.2  Trace-Class  and  Hilbert-Schmidt  Operators 

In  this  section,  we  explore  notions  related  to  the  trace  of  an  operator  on  a 
Hilbert  space.  The  results  of  this  section  are  presented  without  proof;  see 
Chap.  VI  in  Volume  I  of  [34]  for  proofs  and  additional  information. 

Proposition 19.1 Suppose $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint. Then for any two orthonormal bases $\{e_j\}$ and $\{f_j\}$ for $\mathbf{H}$, we have

\[ \sum_j \langle e_j, A e_j\rangle = \sum_j \langle f_j, A f_j\rangle. \]

Note that since $A$ is non-negative, $\langle e_j, A e_j\rangle$ and $\langle f_j, A f_j\rangle$ are non-negative real numbers. Thus, the sums are always well defined, but may have the value of $+\infty$.

Definition 19.2 If $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint, the value of $\sum_j \langle e_j, A e_j\rangle$, for any arbitrarily chosen orthonormal basis, is called the trace of $A$. If $\mathrm{trace}(A) < +\infty$, then we say that $A$ is trace class.

For a general $A \in \mathcal{B}(\mathbf{H})$, we say that $A$ is trace class if the non-negative self-adjoint operator $\sqrt{A^*A}$ is trace class.

Note that for any $A \in \mathcal{B}(\mathbf{H})$, $A^*A$ is self-adjoint and non-negative. Thus, the square root of $A^*A$ may be defined by the functional calculus (Definition 7.13 or Proposition 8.4).

Proposition 19.3

1. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for any orthonormal
basis $\{e_j\}$, the sum $\sum_j \langle e_j, A e_j \rangle$ is absolutely
convergent. Furthermore, the value of this sum, which we denote as
$\mathrm{trace}(A)$, is independent of the choice of orthonormal basis.

2. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then $A^*$ is also trace
class and

\[ \mathrm{trace}(A^*) = \overline{\mathrm{trace}(A)} . \]



3. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for all
$B \in \mathcal{B}(\mathbf{H})$, the operators $AB$ and $BA$ are also trace
class, and

\[ \mathrm{trace}(AB) = \mathrm{trace}(BA) . \]


Recall that $A \in \mathcal{B}(\mathbf{H})$ is said to be compact if $A$ maps
every bounded set in $\mathbf{H}$ to a set with compact closure. If a
self-adjoint operator $A$ is trace class, it is necessarily compact and thus
has an orthonormal basis $\{e_j\}$ of eigenvectors, for which the associated
eigenvalues $\lambda_j$ are real and tend to zero as $j$ tends to infinity.
(See Theorem VI.16 in Volume I of [34]. One can deduce the result from, say,
the direct integral form of the spectral theorem for bounded self-adjoint
operators by verifying that unless $A$ has point spectrum with eigenvalues
tending to zero, the operator of multiplication by $\lambda$ in the direct
integral will not be compact.) Point 1 of Proposition 19.3 then tells us that
$\sum_j |\lambda_j| < \infty$ and that $\mathrm{trace}(A) = \sum_j \lambda_j$.
Conversely, if $A$ is a self-adjoint operator having an orthonormal basis of
eigenvectors for which the associated eigenvalues satisfy
$\sum_j |\lambda_j| < \infty$, then $A$ is trace class.


Definition 19.4 An operator $A \in \mathcal{B}(\mathbf{H})$ is said to be
Hilbert-Schmidt if $\mathrm{trace}(A^*A) < \infty$.


Since $A^*A$ is self-adjoint and non-negative, $\mathrm{trace}(A^*A)$ is
defined (but possibly infinite) for any $A \in \mathcal{B}(\mathbf{H})$. If $A$
is trace class, then (by definition) the trace of $\sqrt{A^*A}$ is finite, in
which case the trace of $\sqrt{A^*A}\sqrt{A^*A} = A^*A$ is also finite, by
Point 3 of Proposition 19.3. Thus, every trace-class operator is
Hilbert-Schmidt (but not vice versa).


Proposition 19.5 If $A \in \mathcal{B}(\mathbf{H})$ is Hilbert-Schmidt, so is
$A^*$. If $A, B \in \mathcal{B}(\mathbf{H})$ are Hilbert-Schmidt, then $AB$ and
$BA$ are trace class and $\mathrm{trace}(AB)$ equals $\mathrm{trace}(BA)$.


If $A$ and $B$ are Hilbert-Schmidt operators, the Hilbert-Schmidt inner
product of $A$ and $B$ is $\langle A, B \rangle_{HS} := \mathrm{trace}(A^*B)$
and the Hilbert-Schmidt norm of $A$ satisfies
$\|A\|_{HS}^2 = \langle A, A \rangle_{HS}$. The space of Hilbert-Schmidt
operators is a Hilbert space with respect to
$\langle \cdot, \cdot \rangle_{HS}$.
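The statements of this section can be checked by hand in finite dimensions, where every operator is trace class. The following is a minimal Python sketch under that assumption; the helper names (`matmul`, `adjoint`, `trace`, `hs_inner`) and the sample matrices are illustrative choices, not notation from the text.

```python
# Finite-dimensional sketch: on C^n every operator is trace class, so the
# identities of this section can be verified directly on small matrices.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose A*."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of <e_j, A e_j> over the standard basis."""
    return sum(a[j][j] for j in range(len(a)))

def hs_inner(a, b):
    """Hilbert-Schmidt inner product <A, B>_HS = trace(A* B)."""
    return trace(matmul(adjoint(a), b))

# Two arbitrary (illustrative) 2x2 complex matrices.
A = [[1 + 2j, 0.5], [-1j, 3]]
B = [[2, 1j], [4 - 1j, -1]]

# Point 3 of Proposition 19.3: trace(AB) = trace(BA).
print(trace(matmul(A, B)), trace(matmul(B, A)))
# Point 2: trace(A*) is the complex conjugate of trace(A).
print(trace(adjoint(A)), complex(trace(A)).conjugate())
# ||A||_HS^2 = <A, A>_HS is the sum of |a_ij|^2 over all entries.
print(hs_inner(A, A).real, sum(abs(x) ** 2 for row in A for x in row))
```

In infinite dimensions these identities require the trace-class or Hilbert-Schmidt hypotheses stated above; the sketch only illustrates the algebra.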


19.3  Density  Matrices:  The  General  Notion 
of  the  State  of  a  Quantum  System 

Typically, we think of the quantum observables (the ones with expectation
values that we wish to take) as being unbounded self-adjoint operators. But of
course we can also take expectation values of bounded self-adjoint operators,
and indeed expectations for bounded operators determine those for unbounded
operators. After all, suppose $A$ is an unbounded self-adjoint operator and
suppose we know the expectation value for $1_E(A)$ for every Borel set
$E \subset \mathbb{R}$, where $1_E$ is the indicator function of $E$ and
$1_E(A)$ is defined by the functional calculus (Definition 7.13). The
expectation value for $1_E(A)$ is the probability of obtaining a value in $E$
for a measurement of the observable $A$. If we know this probability for each
$E$, then we know the full probability distribution of the measurements, and
thus we can compute the expectation value of $A$. Furthermore, we can always
introduce expectation values for (bounded) non-self-adjoint operators. Each
such operator $A$ is of the form $A = A_1 + iA_2$ with $A_1$ and $A_2$
self-adjoint, and so we may reasonably define the expectation value of $A$ to
be the expectation value of $A_1$ plus $i$ times the expectation value of
$A_2$.

We  then  postulate  that  the  general  notion  of  the  “state”  of  a  quantum 
system  should  be  simply  a  “list”  of  expectation  values  for  all  bounded 
operators,  satisfying  some  reasonable  hypotheses. 


Definition 19.6 A linear map $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is
a family of expectation values if the following conditions hold.

1. $\Phi(I) = 1$.

2. $\Phi(A)$ is real whenever $A$ is self-adjoint.

3. $\Phi(A) \ge 0$ whenever $A$ is self-adjoint and non-negative.

4. For any sequence $A_n$ in $\mathcal{B}(\mathbf{H})$, if
$\|A_n\psi - A\psi\| \to 0$ for all $\psi \in \mathbf{H}$, then
$\Phi(A_n) \to \Phi(A)$.


Point 4 in the definition says that $\Phi$ is continuous with respect to
strong (sequential) convergence in $\mathcal{B}(\mathbf{H})$. By Exercise 3,
any linear map on $\mathcal{B}(\mathbf{H})$ satisfying Points 1, 2, and 3 is
automatically continuous with respect to the operator norm topology, meaning
that if $\|A_n - A\| \to 0$ then $\Phi(A_n) \to \Phi(A)$. However, to establish
our characterization of families of expectation values in terms of density
matrices, we need continuity of $\Phi$ under a more general sort of
convergence, where we only assume that $\|A_n\psi - A\psi\| \to 0$ for each
$\psi$. This stronger continuity property does not follow from Properties 1-3.
Exercise 5 gives an example of a linear functional on
$\mathcal{B}(\mathbf{H})$ that satisfies Points 1-3 of Definition 19.6, but
not Point 4.


Definition 19.7 An operator $\rho \in \mathcal{B}(\mathbf{H})$ is a density
matrix if $\rho$ is self-adjoint and non-negative and
$\mathrm{trace}(\rho) = 1$.


Of course, since the trace of a density matrix is assumed to be finite, every
density matrix is trace class. The next two results give a precise
characterization of families of expectation values in terms of density
matrices.


Proposition 19.8 Suppose $\rho$ is a density matrix on $\mathbf{H}$. Then the
map $\Phi_\rho : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ given by

\[ \Phi_\rho(A) = \mathrm{trace}(\rho A) = \mathrm{trace}(A\rho) \]

is a family of expectation values.
Proof. If we define $\Phi_\rho(A) = \mathrm{trace}(\rho A)$, then
$\Phi_\rho(I) = \mathrm{trace}(\rho) = 1$. For any
$A \in \mathcal{B}(\mathbf{H})$, we have

\[ \mathrm{trace}(\rho A^*) = \mathrm{trace}(A^*\rho)
   = \mathrm{trace}((\rho A)^*) = \overline{\mathrm{trace}(\rho A)} . \]

It follows that $\mathrm{trace}(\rho A)$ is real when $A$ is self-adjoint. Let
$\rho^{1/2}$ be the non-negative self-adjoint square root of $\rho$. Then
$\rho^{1/2}$ and $A\rho^{1/2}$ are Hilbert-Schmidt (in the latter case, by
Point 3 of Proposition 19.3). It follows that
$\mathrm{trace}(A\rho^{1/2}\rho^{1/2}) = \mathrm{trace}(\rho^{1/2}A\rho^{1/2})$,
by Proposition 19.5. Thus, if $A$ is self-adjoint and non-negative,

\[ \mathrm{trace}(\rho A) = \mathrm{trace}(\rho^{1/2}\rho^{1/2}A)
   = \mathrm{trace}(\rho^{1/2}A\rho^{1/2}) \ge 0, \tag{19.1} \]

because $\rho^{1/2}A\rho^{1/2}$ is self-adjoint and non-negative. We have
established that $\Phi_\rho$ satisfies Points 1, 2, and 3 of Definition 19.6.

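Meanwhile, suppose $A_n\psi$ converges in norm to $A\psi$, for each $\psi$ in
$\mathbf{H}$. Then $\|A_n\psi\|$ is bounded as a function of $n$ for each fixed
$\psi$. Thus, by the principle of uniform boundedness (Theorem A.40), there is
a constant $C$ such that $\|A_n\| \le C$. Now, if $\{e_j\}$ is an orthonormal
basis for $\mathbf{H}$, we have

\[ \left| \langle e_j, \rho^{1/2}A_n\rho^{1/2}e_j \rangle \right|
   = \left| \langle \rho^{1/2}e_j, A_n\rho^{1/2}e_j \rangle \right|
   \le C \left\| \rho^{1/2}e_j \right\|^2 \]

and

\[ \sum_j \left\| \rho^{1/2}e_j \right\|^2
   = \sum_j \langle e_j, \rho e_j \rangle = \mathrm{trace}(\rho) < \infty . \]

Furthermore, since $A_n(\rho^{1/2}e_j)$ converges to $A(\rho^{1/2}e_j)$ for
each $j$, dominated convergence tells us that

\[ \mathrm{trace}(\rho^{1/2}A\rho^{1/2})
   = \sum_j \langle e_j, \rho^{1/2}A\rho^{1/2}e_j \rangle
   = \lim_{n\to\infty} \sum_j \langle e_j, \rho^{1/2}A_n\rho^{1/2}e_j \rangle
   = \lim_{n\to\infty} \mathrm{trace}(\rho^{1/2}A_n\rho^{1/2}) . \]

As in (19.1), we can shift the second factor of $\rho^{1/2}$ to the front of
the trace to obtain Point 4 in Definition 19.6. ■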


Theorem 19.9 For any family of expectation values
$\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$, there is a unique density
matrix $\rho$ such that $\Phi(A) = \mathrm{trace}(\rho A)$ for all
$A \in \mathcal{B}(\mathbf{H})$.


Proof. Recall from Sect. 3.12 the Dirac notation, in which the expression
$|\phi\rangle\langle\psi|$ denotes the linear operator taking any vector
$\chi \in \mathbf{H}$ to the vector $|\phi\rangle\langle\psi|\chi\rangle$ (in
physics notation), that is, the vector $\langle\psi,\chi\rangle\phi$ (in math
notation). If $\rho$ is trace class, then by Exercise 2,

\[ \mathrm{trace}(\rho\,|\phi\rangle\langle\psi|)
   = \langle\psi, \rho\phi\rangle . \]

Thus, if an operator $\rho$ with the desired properties is to exist, we must
have

\[ \langle\psi, \rho\phi\rangle = \Phi(|\phi\rangle\langle\psi|) . \]


Now, by Exercise 3, $\Phi$ satisfies $|\Phi(A)| \le \|A\|$. From this, we can
see that the map

\[ L_\Phi(\phi, \psi) := \Phi(|\psi\rangle\langle\phi|) \]

is a bounded sesquilinear form, so that (by Proposition A.63), there is a
unique bounded operator $\rho$ such that
$\Phi(|\phi\rangle\langle\psi|) = \langle\psi, \rho\phi\rangle$ for all $\phi$
and $\psi$. Since $|\phi\rangle\langle\phi|$ is self-adjoint and non-negative,
$L_\Phi(\phi, \phi)$ is real and non-negative, which means that $\rho$ is
self-adjoint (by Proposition A.63) and non-negative.

Meanwhile, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then by
Definition 19.2,

\[ \mathrm{trace}(\rho)
   = \lim_{N\to\infty} \sum_{j=1}^{N} \langle e_j, \rho e_j \rangle
   = \lim_{N\to\infty} \Phi\bigl( |e_1\rangle\langle e_1| + \cdots
       + |e_N\rangle\langle e_N| \bigr)
   = \Phi(I) = 1 . \]

In passing from the second expression to the third, we have used Point 4 of
Definition 19.6. Thus, $\rho$ is a density matrix.

We have now found a density matrix $\rho$ such that
$\Phi(|\phi\rangle\langle\psi|)$ agrees with
$\mathrm{trace}(\rho\,|\phi\rangle\langle\psi|)$ for all $\phi$ and $\psi$. By
linearity, $\Phi(A) = \mathrm{trace}(\rho A)$ for all finite-rank operators $A$
(see Exercise 4). Now, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$,
let $P_N$ be the orthogonal projection onto the span of $e_1, \ldots, e_N$.
Then for any $A \in \mathcal{B}(\mathbf{H})$, the operator $P_N A$ has finite
rank and $P_N A\psi \to A\psi$ for all $\psi \in \mathbf{H}$. Thus, for all
$A \in \mathcal{B}(\mathbf{H})$,

\[ \Phi(A) = \lim_{N\to\infty} \Phi(P_N A)
   = \lim_{N\to\infty} \mathrm{trace}(\rho P_N A) = \mathrm{trace}(\rho A), \]

by Proposition 19.8. ■
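The construction in the proof can be imitated in finite dimensions: the matrix entries of the density matrix are read off from the expectation values of the rank-one operators formed from basis vectors. The following is a minimal Python sketch under that assumption; the functional is simulated by a hidden matrix, and the names `trace_against`, `matrix_unit`, and `density_from_expectations` are hypothetical helpers, not notation from the text.

```python
def trace_against(rho):
    """Return the functional A -> trace(rho A) for a fixed matrix rho."""
    n = len(rho)
    def phi(a):
        return sum(rho[i][m] * a[m][i] for i in range(n) for m in range(n))
    return phi

def matrix_unit(row, col, n):
    """Matrix of |e_row><e_col|: it sends e_col to e_row."""
    return [[1 if (i, m) == (row, col) else 0 for m in range(n)]
            for i in range(n)]

def density_from_expectations(phi, n):
    """Entries <e_j, rho e_k> = phi(|e_k><e_j|), as in the proof above."""
    return [[phi(matrix_unit(k, j, n)) for k in range(n)] for j in range(n)]

# An illustrative density matrix on C^2 (self-adjoint, non-negative, trace 1).
hidden = [[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]]

# Treat the functional as a black box and recover the matrix from it.
recovered = density_from_expectations(trace_against(hidden), 2)
print(recovered)
```

In finite dimensions Point 4 of Definition 19.6 is automatic, so the sketch only illustrates the first step of the proof (the sesquilinear form), not the limiting argument.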

Our  next  result  shows  that  our  new  notion  of  the  state  of  a  system 
includes  our  old  notion. 


Proposition 19.10 For any unit vector $\psi \in \mathbf{H}$, let
$|\psi\rangle\langle\psi|$, in accordance with Notation 3.29, denote the
orthogonal projection onto the span of $\psi$. Then $|\psi\rangle\langle\psi|$
is a density matrix and for all $A \in \mathcal{B}(\mathbf{H})$, we have

\[ \mathrm{trace}(|\psi\rangle\langle\psi|\,A) = \langle\psi, A\psi\rangle . \]

Note that if $\psi_2 = e^{i\theta}\psi_1$ then
$|\psi_1\rangle\langle\psi_1| = |\psi_2\rangle\langle\psi_2|$. Thus, from our
new point of view, we may say that the reason $\psi_1$ and $\psi_2$ represent
the same “physical state” is that they determine the same density matrix.

Proof. Since it is an orthogonal projection, $|\psi\rangle\langle\psi|$ is
bounded, self-adjoint, and non-negative. To compute its trace, we choose an
orthonormal basis $\{e_j\}$ for $\mathbf{H}$ with $e_1 = \psi$, which gives
$\mathrm{trace}(|\psi\rangle\langle\psi|) = 1$. Using the same orthonormal
basis, we compute that, for any $A \in \mathcal{B}(\mathbf{H})$,

\[ \mathrm{trace}(|\psi\rangle\langle\psi|\,A)
   = \sum_j \langle e_j, \psi\rangle \langle\psi, A e_j\rangle
   = \langle\psi, A\psi\rangle, \]

as desired. ■

Definition 19.11 A density matrix $\rho \in \mathcal{B}(\mathbf{H})$ is a pure
state if there exists a unit vector $\psi \in \mathbf{H}$ such that $\rho$ is
equal to the orthogonal projection onto the span of $\psi$. The density matrix
$\rho$ is called a mixed state if no such unit vector $\psi$ exists.

An isolated system that is in a pure state initially will remain in a pure
state for all later times, since the initial state $\psi_0$ evolves to the pure
state $e^{-i\hat{H}t/\hbar}\psi_0$, where $\hat{H}$ is the Hamiltonian for the
system. But if a system is interacting with its environment, then as discussed
in Sect. 19.5, the system may move into a mixed state at a later time.

There are several different ways of characterizing the pure states as a
subset of the density matrices. First, it is not hard to see (Exercise 6) that
a density matrix $\rho$ is a pure state if and only if
$\mathrm{trace}(\rho^2) = 1$. Second, the set of density matrices is a convex
set, since if $\rho_1$ and $\rho_2$ are non-negative and have trace 1, then the
same is true of $\lambda\rho_1 + (1-\lambda)\rho_2$, for
$0 \le \lambda \le 1$. According to Exercise 7, the pure states are precisely
the extreme points of this set. That is, a density matrix $\rho$ is a pure
state if and only if it cannot be expressed as
$\rho = \lambda\rho_1 + (1-\lambda)\rho_2$ where $\rho_1$ and $\rho_2$ are
distinct density matrices and $\lambda$ belongs to $(0,1)$. Third, we may
define the von Neumann entropy $S(\rho)$ of a density matrix $\rho$ by

\[ S(\rho) = \mathrm{trace}(-\rho\log\rho), \]

where $\rho\log\rho$ is defined by the functional calculus. (Since
$\lim_{\lambda\to 0^+} \lambda\log\lambda = 0$, we interpret $0\log 0$ as being
0.) Since the eigenvalues of $\rho$ are all between 0 and 1, we see that
$-\rho\log\rho$ is a non-negative self-adjoint operator, which has a
well-defined trace, which may have the value $+\infty$. According to
Exercise 8, a density matrix $\rho$ is a pure state if and only if
$S(\rho) = 0$.
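Two of these characterizations are easy to test numerically on a two-dimensional Hilbert space. The following is a minimal Python sketch under that assumption; it uses the closed-form eigenvalues of a 2x2 self-adjoint matrix, and the function names and sample matrices are illustrative, not taken from the text.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 self-adjoint matrix [[a, b], [conj(b), d]]."""
    a, d = rho[0][0].real, rho[1][1].real
    b = rho[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mean - radius, mean + radius]

def purity(rho):
    """trace(rho^2), computed as the sum of squared eigenvalues."""
    return sum(lam ** 2 for lam in eigenvalues_2x2(rho))

def entropy(rho):
    """S(rho) = trace(-rho log rho), with 0 log 0 interpreted as 0."""
    return sum(-lam * math.log(lam)
               for lam in eigenvalues_2x2(rho) if lam > 1e-15)

# Projection onto the unit vector (1, 1)/sqrt(2): a pure state.
pure = [[0.5, 0.5], [0.5, 0.5]]
# Equal mixture of the two basis projections: a mixed state.
mixed = [[0.5, 0.0], [0.0, 0.5]]

print(purity(pure), entropy(pure))    # purity 1, entropy 0
print(purity(mixed), entropy(mixed))  # purity 1/2, entropy log 2
```

The cutoff `1e-15` is a numerical convenience standing in for the convention that $0\log 0 = 0$.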

Suppose that we have two pure states, coming from unit vectors $\psi_1$ and
$\psi_2$. Then there are two different senses in which we can take a
superposition, that is, linear combination, of the corresponding quantum
states. If we use our old point of view, in which the states are vectors in
$\mathbf{H}$, then we may take the linear combination
$c_1\psi_1 + c_2\psi_2$, and then normalize this vector to be a unit vector.
If we use our new point of view, in which the states are density matrices,
then we may take the linear combination
$c_1|\psi_1\rangle\langle\psi_1| + c_2|\psi_2\rangle\langle\psi_2|$, where in
this case $c_1$ and $c_2$ should be non-negative and should add to 1. These
two notions of superposition are different, since

\[ C\,|c_1\psi_1 + c_2\psi_2\rangle\langle c_1\psi_1 + c_2\psi_2|
   \ne c_1|\psi_1\rangle\langle\psi_1| + c_2|\psi_2\rangle\langle\psi_2|
   \tag{19.2} \]

no matter how the constant $C$ is chosen. After all, the state on the
left-hand side of (19.2) is a pure state, whereas (unless $\psi_2$ is a
multiple of $\psi_1$), the state on the right-hand side of (19.2) is a mixed
state, since the range of this operator is 2-dimensional rather than
1-dimensional.

Physicists call the first sort of superposition (in which we take a linear
combination of vectors in $\mathbf{H}$) coherent superposition or quantum
superposition, and they call the second sort of superposition (in which we
take a linear combination of the associated density matrices) incoherent
superposition. The reason for the term “coherent” is that coherent
superposition depends on the phases of the coefficients. That is, if $\psi_1$
and $\psi_2$ are linearly independent, the vector
$c_1e^{i\theta}\psi_1 + c_2e^{i\phi}\psi_2$ does not represent the same quantum
state as $c_1\psi_1 + c_2\psi_2$, unless $e^{i\theta} = e^{i\phi}$. By
contrast, the density matrix associated with $e^{i\theta}\psi$ is the same as
the density matrix associated with $\psi$, and so the phases have no effect
when taking linear combinations of the density matrices associated to vectors
in $\mathbf{H}$. When taking a coherent superposition, there is no simple
relationship between the expectation value of an observable in the states
$\psi_1$ and $\psi_2$ and the expectation value of the same observable in the
state $c_1\psi_1 + c_2\psi_2$. On the other hand, when taking an incoherent
superposition, expectation values in the new state are just linear
combinations of the original expectation values:

\[ \mathrm{trace}\bigl( (c_1|\psi_1\rangle\langle\psi_1|
     + c_2|\psi_2\rangle\langle\psi_2|)A \bigr)
   = c_1\langle\psi_1, A\psi_1\rangle + c_2\langle\psi_2, A\psi_2\rangle . \]
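The contrast between the two kinds of superposition can be seen in a two-dimensional example. The following is a minimal Python sketch; the basis vectors, the observable `A`, and the helper names are illustrative assumptions rather than objects defined in the text.

```python
import cmath
import math

def inner(u, v):
    """<u, v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def expectation(psi, m):
    """<psi, A psi> for a unit vector psi and self-adjoint A."""
    return inner(psi, apply(m, psi)).real

def projector(psi):
    """Matrix of |psi><psi|, with entries psi_j * conj(psi_k)."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def trace_product(rho, m):
    """trace(rho A)."""
    n = len(rho)
    return sum(rho[i][k] * m[k][i] for i in range(n) for k in range(n))

psi1, psi2 = [1, 0], [0, 1]
A = [[1, 1], [1, -1]]  # an illustrative self-adjoint observable

def coherent(theta):
    """Expectation of A in the unit vector (psi1 + e^{i theta} psi2)/sqrt(2)."""
    phase = cmath.exp(1j * theta)
    psi = [(x + phase * y) / math.sqrt(2) for x, y in zip(psi1, psi2)]
    return expectation(psi, A)

def incoherent(c1, c2):
    """trace((c1 |psi1><psi1| + c2 |psi2><psi2|) A)."""
    p1, p2 = projector(psi1), projector(psi2)
    rho = [[c1 * p1[i][k] + c2 * p2[i][k] for k in range(2)] for i in range(2)]
    return trace_product(rho, A).real

# The coherent value depends on the relative phase (here it equals cos(theta)).
print(coherent(0.0), coherent(math.pi / 2), coherent(math.pi))
# The incoherent value is the weighted average of the two expectations.
print(incoherent(0.75, 0.25),
      0.75 * expectation(psi1, A) + 0.25 * expectation(psi2, A))
```

For this particular choice of `A`, the coherent expectation works out to $\cos\theta$, which makes the phase dependence explicit.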


19.4  Modified  Axioms  for  Quantum  Mechanics 

We may now modify the axioms of quantum mechanics introduced in Sect. 3.6 to
incorporate density matrices, beginning with our revised notion of a state.

Axiom 6 The state of a quantum system is described by a density matrix $\rho$
on an appropriate Hilbert space $\mathbf{H}$. If $A$ is any bounded operator on
$\mathbf{H}$, the expectation value of $A$ in the state $\rho$ is given by the
quantity $\mathrm{trace}(\rho A) = \mathrm{trace}(A\rho)$.

In Axiom 6, we assume that $A$ is bounded, so that $\mathrm{trace}(\rho A)$ and
$\mathrm{trace}(A\rho)$ are defined and equal by Proposition 19.3. If $A$ is
unbounded and self-adjoint, we can construct a probability measure
$\mu^A_\rho$ describing the probabilities for measurements of $A$ in the state
$\rho$, by the formula

\[ \mu^A_\rho(E) = \mathrm{trace}(\rho\,1_E(A)), \]

where $1_E(A)$ is defined by the functional calculus.

We then define the expectation value of $A$ in the state $\rho$ as
$\int_{\mathbb{R}} \lambda\, d\mu^A_\rho(\lambda)$, provided the integral is
absolutely convergent. If the integral is absolutely convergent, it is
reasonable to hope that both $\rho A$ and $A\rho$ will be densely defined and
bounded, that (the bounded extension to $\mathbf{H}$ of) these operators will
be trace class, and that both $\mathrm{trace}(\rho A)$ and
$\mathrm{trace}(A\rho)$ will coincide with
$\int_{\mathbb{R}} \lambda\, d\mu^A_\rho(\lambda)$. We will not, however, enter
into an investigation of this issue.

Next,  we  propose  a  variant  of  Axiom  4,  describing  the  “collapse  of  the 
wave  function.” 


Axiom 7 Suppose a quantum system is initially in a state $\rho$ and a
measurement of a self-adjoint operator $A$ with point spectrum is performed.
If the measurement results in the value $\lambda$ for $A$, then immediately
after the measurement, the system will be in the state $\rho'$, where

\[ \rho' = \frac{1}{Z} P_\lambda \rho P_\lambda . \]

Here $P_\lambda$ is the orthogonal projection onto the $\lambda$-eigenspace of
$A$ and $Z = \mathrm{trace}(P_\lambda \rho P_\lambda)$.


Note that if $\rho$ is non-negative, self-adjoint, and trace class, then
$P_\lambda \rho P_\lambda$ is also non-negative, self-adjoint, and trace
class. Implicit in Axiom 7 is the assumption that the measurement can only
result in values $\lambda$ for which $P_\lambda \rho P_\lambda$ is nonzero. In
particular, $\lambda$ must be an eigenvalue for $A$.
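The collapse rule of Axiom 7 can be tried out on a three-dimensional example with a degenerate eigenvalue. The following is a minimal Python sketch; the density matrix, the projection, and the helper names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[j][j] for j in range(len(a)))

def collapse(rho, proj):
    """Return (P rho P / Z, Z) with Z = trace(P rho P), as in Axiom 7."""
    sandwiched = matmul(matmul(proj, rho), proj)
    z = trace(sandwiched)
    if z == 0:
        raise ValueError("P rho P is zero: this outcome cannot occur")
    return [[x / z for x in row] for row in sandwiched], z

# Illustrative density matrix on C^3 (symmetric, diagonally dominant, trace 1).
rho = [[0.5, 0.1, 0.1],
       [0.1, 0.3, 0.0],
       [0.1, 0.0, 0.2]]
# Projection onto a two-dimensional eigenspace spanned by e_1 and e_2.
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]

post, Z = collapse(rho, P)
print(Z)     # normalization constant trace(P rho P)
print(post)  # state after the measurement; it has trace 1
```

Because the eigenspace here is two-dimensional, the state after the measurement need not be pure.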

Finally,  we  introduce  the  notion  of  time-evolution  for  our  new  notion  of 
“state.” 


Axiom 8 The time evolution of the state of the system is described by the
following equation for a time-dependent density matrix $\rho(t)$:

\[ \frac{d\rho}{dt} = -\frac{1}{i\hbar}[\rho, \hat{H}] . \tag{19.3} \]

This equation may be solved, formally, by setting

\[ \rho(t) = e^{-it\hat{H}/\hbar} \rho_0 e^{it\hat{H}/\hbar}, \tag{19.4} \]

where $\rho_0$ is the state of the system at time $t = 0$.

There are some domain issues involved in the interpretation of the equation
(19.3). Rather than entering into an examination of those issues here, we will
simply take (19.4) as the definition of the time-evolution of a density
matrix. Presumably, if $\rho_0$ is nice enough, then the map
$t \mapsto \rho(t)$ will be differentiable as a curve in the Banach space
$\mathcal{B}(\mathbf{H})$ and its derivative will be (an extension of) the
operator on the right-hand side of (19.3). By comparison, it follows from
Stone’s theorem and Lemma 10.17 that the family of pure states
$\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ satisfies the Schrödinger equation in
the natural Hilbert space sense if and only if $\psi_0$ belongs to the domain
of $\hat{H}$. To see that the time-evolution in (19.4) is consistent with the
previously defined time-evolution of pure states, observe that

\[ e^{-it\hat{H}/\hbar}\,|\psi_0\rangle\langle\psi_0|\,e^{it\hat{H}/\hbar}
   = |e^{-it\hat{H}/\hbar}\psi_0\rangle\langle e^{-it\hat{H}/\hbar}\psi_0|, \]

since $(e^{it\hat{H}/\hbar})^* = e^{-it\hat{H}/\hbar}$.
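The consistency between (19.4) and the evolution of pure states can be checked numerically when the Hamiltonian is a diagonal matrix. The following is a minimal Python sketch under that assumption, in units with $\hbar = 1$ by default; the energy levels, the initial vector, and the function names are illustrative.

```python
import cmath

def evolve_density(rho0, energies, t, hbar=1.0):
    """rho(t) = e^{-itH/hbar} rho0 e^{itH/hbar} for H = diag(energies)."""
    n = len(energies)
    return [[cmath.exp(-1j * t * (energies[j] - energies[k]) / hbar) * rho0[j][k]
             for k in range(n)] for j in range(n)]

def evolve_vector(psi0, energies, t, hbar=1.0):
    """psi(t) = e^{-itH/hbar} psi0 for H = diag(energies)."""
    return [cmath.exp(-1j * t * e / hbar) * c for e, c in zip(energies, psi0)]

def projector(psi):
    """Matrix of |psi><psi|, with entries psi_j * conj(psi_k)."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

energies = [1.0, 2.5]   # illustrative eigenvalues of H
psi0 = [0.6, 0.8j]      # a unit vector in C^2
t = 0.7

# Evolve the density matrix of the pure state, and separately evolve the vector.
lhs = evolve_density(projector(psi0), energies, t)
rhs = projector(evolve_vector(psi0, energies, t))
print(lhs)
print(rhs)
```

The two printed matrices agree entry by entry, and the trace of `lhs` remains 1, since conjugation by a unitary operator preserves the trace.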
It should be noted that (19.3) differs by a minus sign from the
time-evolution in the Heisenberg picture of quantum mechanics
(Definition 3.20). Although this difference may seem strange, keep in mind
that in Axiom 8, we are not adopting the Heisenberg point of view, in which
the states are independent of time and the observables evolve in time. Rather,
we are adopting a modified version of the Schrödinger picture, in which it is
the states that evolve in time, but where the states are now certain sorts of
operators. Even though both the states and the observables are now operators,
the observables (in the Heisenberg picture) and the states (in the Schrödinger
picture) must evolve in opposite directions in time, in order for the
expectation values of the observables to be the same in the two pictures.


19.5  Composite  Systems  and  the  Tensor  Product 


As discussed in Sect. 3.11, the Hilbert space for two (nonidentical, spinless)
particles moving in $\mathbb{R}^3$ is $L^2(\mathbb{R}^6)$. Given a unit vector
(i.e., a pure state) $\psi$ in $L^2(\mathbb{R}^6)$, the quantity
$|\psi(\mathbf{x}^1, \mathbf{x}^2)|^2$ represents the joint probability
distribution for the position $\mathbf{x}^1$ of the first particle and the
position $\mathbf{x}^2$ of the second particle. The following result shows
that $L^2(\mathbb{R}^6)$ is naturally isomorphic to the Hilbert tensor product
of two copies of the Hilbert space for the individual particles, namely
$L^2(\mathbb{R}^3)$.


Proposition 19.12 Suppose that $(X_1, \mu_1)$ and $(X_2, \mu_2)$ are
$\sigma$-finite measure spaces. Then there is a unique unitary map

\[ \rho : L^2(X_1, \mu_1) \,\hat{\otimes}\, L^2(X_2, \mu_2)
   \to L^2(X_1 \times X_2, \mu_1 \times \mu_2) \]

such that

\[ \rho(\phi \otimes \psi)(x, y) = \phi(x)\psi(y) \]

for all $\phi \in L^2(X_1, \mu_1)$ and $\psi \in L^2(X_2, \mu_2)$.

Here $\hat{\otimes}$ denotes the Hilbert tensor product defined in
Appendix A.4.5.
Proof. For simplicity of notation, we suppress the dependence of $L^2$ spaces
on the measure, writing, say, $L^2(X_1)$ rather than $L^2(X_1, \mu_1)$.
Consider first the algebraic (i.e., uncompleted) tensor product
$L^2(X_1) \otimes L^2(X_2)$. Using the universal property of tensor products,
we can construct a linear map $\rho$ of
$L^2(X_1) \otimes L^2(X_2) \to L^2(X_1 \times X_2)$ determined uniquely by the
requirement that

\[ \rho(\phi \otimes \psi)(x, y) = \phi(x)\psi(y) . \]

Now, every element of the algebraic tensor product
$L^2(X_1) \otimes L^2(X_2)$ can be expressed as a linear combination of
elements of the form $\phi_j \otimes \psi_j$, with $\phi_j \in L^2(X_1)$ and
$\psi_j$ in $L^2(X_2)$. By computing on such linear combinations, we can
easily verify that $\rho$ is isometric. Thus, by the bounded linear
transformation (BLT) theorem (Theorem A.36), $\rho$ has a unique isometric
extension to a map of the completed tensor product
$L^2(X_1) \,\hat{\otimes}\, L^2(X_2)$ into $L^2(X_1 \times X_2)$.

It remains only to show that $\rho$ is surjective. Because $\rho$ is
isometric, its image is closed, so it suffices to show that only the zero
vector is orthogonal to the image. Since both measures are $\sigma$-finite, it
is a simple exercise to reduce the problem to the case where $\mu_1$ and
$\mu_2$ are finite, which we henceforth assume. Suppose
$\psi \in L^2(X_1 \times X_2)$ is orthogonal to the image of $\rho$. Then
$\psi$ is orthogonal to the indicator function of every measurable rectangle,
and hence to the indicator function of any finite disjoint union of measurable
rectangles. The collection $\mathcal{A}$ of such disjoint unions is an algebra
of sets. Let $\mathcal{M}$ denote the collection of measurable subsets $E$ of
$X_1 \times X_2$ such that the integral of $\psi$ over $E$ is zero. Then
$\mathcal{M}$ is a monotone class containing $\mathcal{A}$. By the monotone
class lemma (Theorem A.8), $\mathcal{M}$ contains the $\sigma$-algebra
generated by $\mathcal{A}$, which is the $\sigma$-algebra on which
$\mu_1 \times \mu_2$ is defined. Thus, the integral of $\psi$ over every
measurable set is zero, which implies that $\psi$ is zero almost everywhere. ■

The preceding example suggests the following general principle.

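Axiom 9 The Hilbert space for a composite system made up of two subsystems is
the Hilbert tensor product $\mathbf{H}_1 \,\hat{\otimes}\, \mathbf{H}_2$ of the
Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$ describing the subsystems.

If $A$ and $B$ are bounded operators on $\mathbf{H}_1$ and $\mathbf{H}_2$,
respectively, then there is a unique bounded operator $A \otimes B$ on
$\mathbf{H}_1 \,\hat{\otimes}\, \mathbf{H}_2$ such that

\[ (A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi) \]

for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$. (See
Appendix A.4.5.)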

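Theorem 19.13 Suppose that $\rho$ is a density matrix on
$\mathbf{H}_1 \,\hat{\otimes}\, \mathbf{H}_2$. Then there exists a unique
density matrix $\rho^{(1)}$ on $\mathbf{H}_1$ with the property that

\[ \mathrm{trace}(\rho^{(1)}A) = \mathrm{trace}(\rho(A \otimes I))
   \tag{19.5} \]

for all $A \in \mathcal{B}(\mathbf{H}_1)$. We call $\rho^{(1)}$ the partial
trace of $\rho$ with respect to $\mathbf{H}_2$. If $\{f_k\}$ is an orthonormal
basis for $\mathbf{H}_2$, then the operator $\rho^{(1)}$ satisfies

\[ \langle\phi, \rho^{(1)}\psi\rangle
   = \sum_k \langle\phi \otimes f_k, \rho(\psi \otimes f_k)\rangle
   \tag{19.6} \]

for all $\phi, \psi \in \mathbf{H}_1$. Similarly, there is a unique density
matrix $\rho^{(2)}$ on $\mathbf{H}_2$ satisfying
$\mathrm{trace}(\rho^{(2)}B) = \mathrm{trace}(\rho(I \otimes B))$ for all
$B \in \mathcal{B}(\mathbf{H}_2)$. If $\{e_j\}$ is an orthonormal basis for
$\mathbf{H}_1$, then $\rho^{(2)}$ satisfies

\[ \langle\phi, \rho^{(2)}\psi\rangle
   = \sum_j \langle e_j \otimes \phi, \rho(e_j \otimes \psi)\rangle
   \tag{19.7} \]

for all $\phi, \psi \in \mathbf{H}_2$.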


The motivation for the terminology “partial trace” is provided by (19.6) and
(19.7), which are similar to the formula for the trace of an operator, except
that the sums range only over a basis for one of the two Hilbert spaces. One
special case of Theorem 19.13 is the one in which the density matrix $\rho$ is
of the form $\rho = \rho_1 \otimes \rho_2$ where $\rho_1$ and $\rho_2$ are
density matrices on $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. (Any
operator $\rho$ of this form is a density matrix on
$\mathbf{H}_1 \,\hat{\otimes}\, \mathbf{H}_2$.) In that case, it is not hard to
see that $\rho^{(1)} = \rho_1$ and $\rho^{(2)} = \rho_2$. We may describe this
case by saying that the state of the first system is “independent” of the
state of the second system.

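Lemma 19.14 For any sequence $A_n \in \mathcal{B}(\mathbf{H}_1)$, if
$\|A_n\phi - A\phi\| \to 0$ for some $A \in \mathcal{B}(\mathbf{H}_1)$ and all
$\phi \in \mathbf{H}_1$, then

\[ \|(A_n \otimes I)\psi - (A \otimes I)\psi\| \to 0 \]

for all $\psi \in \mathbf{H}_1 \,\hat{\otimes}\, \mathbf{H}_2$. A similar
result holds for operators of the form $I \otimes B_n$.

Proof. See Exercise 9. ■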

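Proof of Theorem 19.13. The existence and uniqueness of $\rho^{(1)}$ and
$\rho^{(2)}$ follow from Lemma 19.14 and Theorem 19.9. Meanwhile, if $\{e_j\}$
is an orthonormal basis for $\mathbf{H}_1$ and $\{f_k\}$ is an orthonormal
basis for $\mathbf{H}_2$, we have

\begin{align*}
\langle\phi, \rho^{(1)}\psi\rangle
  &= \mathrm{trace}(\rho^{(1)}\,|\psi\rangle\langle\phi|) \\
  &= \sum_{j,k} \langle e_j \otimes f_k,
       \rho(|\psi\rangle\langle\phi| \otimes I)(e_j \otimes f_k)\rangle \\
  &= \sum_k \sum_j \langle\phi, e_j\rangle
       \langle e_j \otimes f_k, \rho(\psi \otimes f_k)\rangle \\
  &= \sum_k \langle\phi \otimes f_k, \rho(\psi \otimes f_k)\rangle .
\end{align*}

This is the desired formula for $\langle\phi, \rho^{(1)}\psi\rangle$. Note
that because $\rho$ is trace class and $|\psi\rangle\langle\phi| \otimes I$ is
bounded, $\rho(|\psi\rangle\langle\phi| \otimes I)$ is trace class, in which
case the sum in the second line is absolutely convergent, by
Proposition 19.3. Thus, we are allowed to rearrange the sum freely. ■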
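Formula (19.6) becomes a finite sum when both Hilbert spaces are finite-dimensional. The following is a minimal Python sketch under that assumption; it orders the basis vectors $e_j \otimes f_k$ by the index `j * d2 + k`, which is a convention chosen here for illustration, as are the function names.

```python
import math

def kron(a, b):
    """Matrix of a (x) b in the basis e_j (x) f_k, ordered by index j*d2 + k."""
    d1, d2 = len(a), len(b)
    return [[a[j][jp] * b[k][kp] for jp in range(d1) for kp in range(d2)]
            for j in range(d1) for k in range(d2)]

def partial_trace_second(rho, d1, d2):
    """rho^(1): entries sum_k <e_j (x) f_k, rho (e_j' (x) f_k)>, as in (19.6)."""
    return [[sum(rho[j * d2 + k][jp * d2 + k] for k in range(d2))
             for jp in range(d1)] for j in range(d1)]

def projector(psi):
    """Matrix of |psi><psi|, with entries psi_j * conj(psi_k)."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

# Product state: the partial trace returns the first factor.
rho1 = [[0.7, 0.1], [0.1, 0.3]]
rho2 = [[0.4, 0.0], [0.0, 0.6]]
reduced_product = partial_trace_second(kron(rho1, rho2), 2, 2)
print(reduced_product)

# Pure state (e_0 (x) f_0 + e_1 (x) f_1)/sqrt(2) of the composite system.
s = 1 / math.sqrt(2)
entangled = projector([s, 0, 0, s])
reduced_entangled = partial_trace_second(entangled, 2, 2)
print(reduced_entangled)  # half the identity: a mixed state of the subsystem
```

The second example shows that a pure state of the composite system can have a mixed state as its partial trace.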

Suppose we have two quantum systems with Hilbert spaces $\mathbf{H}_1$ and
$\mathbf{H}_2$ and Hamiltonians $\hat{H}_1$ and $\hat{H}_2$. If the two systems
do not interact with each other and the composite system is initially in a
(pure) state of the form $\phi_0 \otimes \psi_0$, then we expect that at some
later time, the composite system will be in the state
$\phi(t) \otimes \psi(t)$, where $\phi(t) = e^{-it\hat{H}_1/\hbar}\phi_0$ and
$\psi(t) = e^{-it\hat{H}_2/\hbar}\psi_0$. Ignoring domain considerations, we
may compute that

\begin{align*}
i\hbar \frac{d}{dt}\bigl(\phi(t) \otimes \psi(t)\bigr)
  &= (\hat{H}_1\phi(t)) \otimes \psi(t) + \phi(t) \otimes (\hat{H}_2\psi(t)) \\
  &= (\hat{H}_1 \otimes I + I \otimes \hat{H}_2)(\phi(t) \otimes \psi(t)) .
\end{align*}

This calculation suggests that the correct Hamiltonian for a noninteracting
composite system is the operator $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$.

It is not, however, obvious how to select a domain for
$\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ in such a way that this operator
will be self-adjoint. (The reader is invited to try to choose such a domain
“by hand.”) The easiest way to deal with this issue is to use Stone’s theorem,
as in the following definition.

Definition 19.15 If $A$ and $B$ are self-adjoint operators on $\mathbf{H}_1$
and $\mathbf{H}_2$, define the operator $A \otimes I + I \otimes B$ to be the
infinitesimal generator of the strongly continuous one-parameter unitary group
$e^{itA} \otimes e^{itB}$. Thus, by Stone’s theorem,
$A \otimes I + I \otimes B$ is self-adjoint.

It is not hard to check that $e^{itA} \otimes e^{itB}$ is indeed strongly
continuous. In the case $B = 0$, the operator $A \otimes I$ is defined as the
infinitesimal generator of $e^{itA} \otimes I$. If $A$ and $B$ happen to be
bounded, then $A \otimes I + I \otimes B$ defined by Definition 19.15 coincides
with $A \otimes I + I \otimes B$ defined as the sum of tensor products of
bounded operators, as in Sect. A.4.5.

Axiom 10 Suppose $\mathbf{H}_1$ and $\mathbf{H}_2$ are the Hilbert spaces for
two quantum systems, with Hamiltonians $\hat{H}_1$ and $\hat{H}_2$,
respectively. Then the Hamiltonian for the noninteracting composite system is
$\hat{H}_1 \otimes I + I \otimes \hat{H}_2$, where the domain of
$\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ is as in Definition 19.15.

A physicist would write $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ simply as
$\hat{H}_1 + \hat{H}_2$, with the understanding that $\hat{H}_1$ acts only on
the first factor in the tensor product and $\hat{H}_2$ acts only on the second
factor.
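When both operators are diagonal matrices, the relation between the product group $e^{itA} \otimes e^{itB}$ and the operator $A \otimes I + I \otimes B$ can be checked entry by entry: the eigenvalues simply add. The following is a minimal Python sketch under that (bounded, diagonal) assumption; the energy levels and function names are illustrative.

```python
import cmath

def tensor_sum_spectrum(a_levels, b_levels):
    """Eigenvalues a_j + b_k of A (x) I + I (x) B for diagonal A and B."""
    return [a + b for a in a_levels for b in b_levels]

def product_group(a_levels, b_levels, t):
    """Diagonal entries of e^{itA} (x) e^{itB} in the basis e_j (x) f_k."""
    return [cmath.exp(1j * t * a) * cmath.exp(1j * t * b)
            for a in a_levels for b in b_levels]

a_levels = [0.0, 1.0]   # illustrative eigenvalues of A
b_levels = [0.5, 2.0]   # illustrative eigenvalues of B
t = 0.3

spectrum = tensor_sum_spectrum(a_levels, b_levels)
group = product_group(a_levels, b_levels, t)

# Each entry of the product group is e^{it(a_j + b_k)}, so its generator
# is the diagonal operator with entries a_j + b_k.
for s, g in zip(spectrum, group):
    print(s, g, cmath.exp(1j * t * s))
```

In physical terms, for a noninteracting composite system the energy levels are sums of the energy levels of the two subsystems.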

In general, the two components of a composite system will interact, in which
case the Hamiltonian for the composite system is typically of the form

\[ \hat{H} = \hat{H}_1 \otimes I + I \otimes \hat{H}_2 + \hat{H}_{\mathrm{int}}, \]

where $\hat{H}_{\mathrm{int}}$ is an “interaction term.” Often, the
interaction term may be considered “small” compared with the other terms in
the Hamiltonian. Consider, for example, a system consisting of particles in a
box, with a barrier dividing the box in half. Suppose the particles interact
by means of a two-particle potential of the form
$\sum_{j<k} V(\mathbf{x}^j - \mathbf{x}^k)$ (Sect. 2.3.2) and that
$V(\mathbf{x}^j - \mathbf{x}^k)$ is very small unless the two particles are
close together. There will typically be far more pairs of nearby particles in
which the two particles are on the same side of the box than nearby pairs on
opposite sides. Thus, even though the interaction between the two systems may
substantially affect the behavior of the composite system over long periods of
time, it is still reasonable to think of $\hat{H}_1 \otimes I$ as “the energy
of the first subsystem” and $I \otimes \hat{H}_2$ as “the energy of the second
subsystem.”

Suppose we start out in a state $\rho$ of the composite system for which
the state $\rho^{(1)}$ of the first subsystem is a pure state. If the system is an
interacting one, the first subsystem will probably not remain in a pure
state at later times. Indeed, suppose that the second subsystem is a very
large system having temperature $T$. Then, according to the postulates of
quantum  statistical  mechanics,  we  are  supposed  to  believe  that  once  the  two 
systems  have  reached  thermal  equilibrium,  the  state  of  the  first  subsystem 
will  be  given  by  the  following  highly  mixed  state: 


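\[ \rho^{(1)} = \frac{1}{Z(\beta)}\, e^{-\beta \hat H_1}. \tag{19.8} \]

Here $\beta = 1/(k_B T)$, where $k_B$ is Boltzmann’s constant, and $Z(\beta)$ is a normalization constant, known as the partition function of the theory, given by $Z(\beta) = \operatorname{trace}(e^{-\beta \hat H_1})$.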
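To make the thermal state concrete, here is a minimal numerical sketch. It assumes a Hamiltonian that is already diagonal, with hypothetical oscillator levels $E_n = n + 1/2$ (units in which $\hbar\omega = 1$) truncated at 200 levels; the function names are illustrative and not from the text. It builds the density matrix $e^{-\beta \hat H_1}/Z(\beta)$ and evaluates $\operatorname{trace}(\rho A)$ for $A = \hat H_1$.

```python
import numpy as np

def gibbs_state(energies, beta):
    """Return (rho, Z) for a Hamiltonian that is diagonal with the given eigenvalues."""
    energies = np.asarray(energies, dtype=float)
    # Subtract the lowest energy before exponentiating for numerical stability;
    # the shift cancels in rho and is restored in Z.
    shifted = np.exp(-beta * (energies - energies.min()))
    rho = np.diag(shifted / shifted.sum())
    Z = shifted.sum() * np.exp(-beta * energies.min())
    return rho, Z

def expectation(rho, A):
    """Expectation value trace(rho A) of an observable A in the state rho."""
    return np.trace(rho @ A).real

# Hypothetical example: truncated oscillator spectrum E_n = n + 1/2.
levels = np.arange(200) + 0.5
H1 = np.diag(levels)
beta = 2.0

rho, Z = gibbs_state(levels, beta)
mean_energy = expectation(rho, H1)
purity = np.trace(rho @ rho).real  # equals 1 only for a pure state

print(mean_energy, 0.5 + 1.0 / (np.exp(beta) - 1.0))
print(purity)
```

For this spectrum the mean energy agrees with the closed form $1/2 + 1/(e^{\beta} - 1)$, which is the Planck-type expression, and the purity is strictly less than 1, reflecting that the state is mixed.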

Of course, for this idea to make sense, $e^{-\beta \hat H_1}$ must be trace class. This will be the case provided that $\hat H_1$ has discrete spectrum with eigenvalues tending to $+\infty$ at some reasonable rate. Thus, in quantum statistical mechanics, the expectation value of some observable $A$ for the first subsystem will be (once equilibrium is reached)

\[ \langle A \rangle = \frac{1}{Z(\beta)} \operatorname{trace}(e^{-\beta \hat H_1} A). \tag{19.9} \]

In particular, when $A = \hat H_1$, (19.9) provides a natural generalization of
Planck’s model of blackbody radiation; compare Exercise 2 in Chap. 1.

As discussed in Sect. 17.8, each type of particle (electron, proton, neutron, etc.) has a spin $l$, where the possible values for $l$ are

\[ l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots. \]

The Hilbert space for a particle moving in $\mathbb{R}^3$ and having spin $l$ is $L^2(\mathbb{R}^3) \otimes V_l$, where $V_l$ is a finite-dimensional Hilbert space that carries an irreducible projective unitary representation of $\mathrm{SO}(3)$ of dimension $2l + 1$. There is a natural unitary identification of $L^2(\mathbb{R}^3) \otimes V_l$ with $L^2(\mathbb{R}^3; V_l)$, the space of square-integrable functions on $\mathbb{R}^3$ with values in $V_l$, in which the element $\psi \otimes v$ of $L^2(\mathbb{R}^3) \otimes V_l$ is identified with the function

\[ \mathbf{x} \mapsto \psi(\mathbf{x})\, v \]


in $L^2(\mathbb{R}^3; V_l)$.

Now, we have already mentioned, in Sect. 3.11, the idea that in quantum mechanics, identical particles are indistinguishable. Let us think about this in the case of two identical particles with spin $l$. Our first guess as to the Hilbert space for such a system is the tensor product of two copies of $L^2(\mathbb{R}^3; V_l)$, which may be identified with

\[ L^2(\mathbb{R}^6; V_l \otimes V_l). \]

If $\psi$ is a unit vector in this space, thought of as a pure state, then saying that the two particles are “indistinguishable” means that $\psi(\mathbf{x}^2, \mathbf{x}^1)$ should represent the same physical state as $\psi(\mathbf{x}^1, \mathbf{x}^2)$, that is, $\psi(\mathbf{x}^2, \mathbf{x}^1) = c\,\psi(\mathbf{x}^1, \mathbf{x}^2)$ for some nonzero constant $c$. Applying this rule twice gives $c^2 = 1$, which shows that $c$ must be either $1$ or $-1$.

A variety of theoretical and experimental considerations suggest the following principle: For particles with integer spin ($l = 0, 1, \ldots$), the constant $c$ in the preceding paragraph is $1$, whereas for particles with half-integer spin ($l = 1/2, 3/2, \ldots$) the constant $c$ is $-1$. Particles with integer spin are called bosons and particles with half-integer spin are called fermions.
We  encode  the  discussion  in  the  two  preceding  paragraphs  in  the  following 
axiom. 


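Axiom 11 Consider a collection of $N$ identical particles moving in $\mathbb{R}^3$ and having integer spin $l$. Then the Hilbert space for such a collection is the subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those square-integrable functions $\psi$ for which

\[ \psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \]

for every permutation $\sigma$. Consider also a collection of $N$ identical particles moving in $\mathbb{R}^3$ and having half-integer spin $l$. Then the Hilbert space for such a collection is the subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those square-integrable functions $\psi$ for which

\[ \psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \operatorname{sign}(\sigma)\, \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \]

for every permutation $\sigma$.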

One  may  well  ask  why  Axiom  11  holds.  More  specifically,  one  may  first 
ask why it is that identical particles are indistinguishable, and then separately
ask why integer-spin particles are bosons and half-integer-spin particles
are fermions. Both questions are best answered from the point of
view  of  quantum  field  theory,  to  which  ordinary  nonrelativistic  quantum 
mechanics  is  an  approximation. 

In field theory, one starts with a “classical” field theory, meaning a differential equation for functions $\phi(\mathbf{x}, t)$ on $\mathbb{R}^4$ with values in some finite-dimensional vector space. Electromagnetic fields, for example, are (at any one fixed time) functions on $\mathbb{R}^3$ with values in $\mathbb{R}^6$, where $\mathbb{R}^6$ describes the three components of the electric field and the three components of the magnetic field. These functions on $\mathbb{R}^3$ then evolve in time according to Maxwell’s equations. In quantum field theory, one regards, say, Maxwell’s equations as a sort of infinite-dimensional dynamical system, which we may quantize in something like the way we quantize Newton’s equation to get ordinary nonrelativistic quantum mechanics. In the quantum version of Maxwell’s equations, the energy in each mode of the fields is “quantized,” meaning that one can only add energy to each mode in multiples of a certain unit (or “quantum”) of energy. This is analogous to the quantum harmonic oscillator, in which the allowed energies differ by integer multiples of $\hbar\omega$. In quantum field theory, then, a particle is one quantum of excitation of a certain field.

For simplicity, let us think of a field theory in which the classical fields take values in $\mathbb{R}$. Then even at the classical level, it is possible to think that we have something like particles, namely localized bumps in the field $\phi(\mathbf{x})$ located at several different points in space. These bumps might, for example, be in the shape of a Gaussian wave packet, that is, a Gaussian envelope multiplied by a sinusoidally oscillating function. From this point of view, we can gain some understanding of why identical particles are indistinguishable. Suppose we have a Gaussian wave packet near a point $\mathbf{a}$ in $\mathbb{R}^3$ and then an identically shaped Gaussian wave packet near another point $\mathbf{b}$. The state $\phi(\mathbf{x})$ of the field is precisely the same as if we have a packet near $\mathbf{b}$ and then also a packet near $\mathbf{a}$. That is to say, there is no distinct state of the system that corresponds to interchanging the two particles; whichever bump we think of as the “first” particle, we have the same field $\phi(\mathbf{x})$. Even in the quantum version of such a system, there is no meaning to asking which is the first particle and which is the second. Thus, even in nonrelativistic quantum mechanics, which is a low-energy approximation to quantum field theory, we expect identical particles to be indistinguishable.

Although the preceding discussion does not explain the distinction between
bosons and fermions, that distinction also emerges from quantum
field  theory,  through  something  called  the  spin-statistics  theorem 
(see,  e.g.,  [38]). 


19.7  “Statistics”  and  the  Pauli  Exclusion  Principle 

The spin of an electron is equal to 1/2 and electrons are, therefore, fermions. The famous Pauli exclusion principle is a consequence of the fermionic nature of electrons. Pauli’s principle states that two electrons cannot be in the same state at the same time. This means that if $\psi$ is a square-integrable, $\mathbb{C}^2$-valued function on $\mathbb{R}^3$ (which could describe the state of a single electron), then the function $\Psi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

\[ \Psi(\mathbf{x}^1, \mathbf{x}^2) = \psi(\mathbf{x}^1) \otimes \psi(\mathbf{x}^2) \]

is not a possible state of a two-electron system, since $\Psi$ does not satisfy Axiom 11. On the other hand, if $\psi_1$ and $\psi_2$ are two linearly independent elements of $L^2(\mathbb{R}^3; \mathbb{C}^2)$, then the function $\Psi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

\[ \Psi(\mathbf{x}^1, \mathbf{x}^2) = \psi_1(\mathbf{x}^1) \otimes \psi_2(\mathbf{x}^2) - \psi_2(\mathbf{x}^1) \otimes \psi_1(\mathbf{x}^2) \tag{19.10} \]

is a possible state of a two-electron system. [If $\psi_1$ and $\psi_2$ are independent, then $\Psi$ is a nonzero element of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$, which can then be normalized to be a unit vector. See Exercise 10.]
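A finite-dimensional sketch may help to see why the antisymmetrized product vanishes exactly when the two single-particle states coincide. The code below is an illustration under stated assumptions, not the construction used in the text: each particle is described by a vector in $\mathbb{C}^d$ whose index stands for a combined (position, spin) label on a hypothetical grid, so that interchanging the particles interchanges both labels at once. The helper names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6  # hypothetical number of combined (position, spin) labels

def random_state(dim):
    """Random unit vector in C^dim, standing in for a one-particle state."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def antisymmetric_pair(psi1, psi2):
    """Discrete analogue of (19.10): Psi[a, b] = psi1[a] psi2[b] - psi2[a] psi1[b]."""
    return np.outer(psi1, psi2) - np.outer(psi2, psi1)

psi1, psi2 = random_state(d), random_state(d)
Psi = antisymmetric_pair(psi1, psi2)

# Squared norm of Psi, compared with 2 (1 - |<psi1, psi2>|^2) for unit vectors.
norm_sq = np.linalg.norm(Psi) ** 2
gram = 2.0 * (1.0 - abs(np.vdot(psi1, psi2)) ** 2)

# Putting both particles in the same state gives the zero vector.
same_state = antisymmetric_pair(psi1, psi1)

print(np.allclose(Psi.T, -Psi), norm_sq, gram, np.allclose(same_state, 0.0))
```

In this discrete setting the squared norm of $\Psi$ equals $2(\|\psi_1\|^2\|\psi_2\|^2 - |\langle\psi_1,\psi_2\rangle|^2)$, which by the Cauchy-Schwarz inequality is zero only when the two vectors are linearly dependent.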

Let us try to understand the implications of the Pauli exclusion principle for multielectron atoms. Let us model an $N$-electron atom as having a nucleus with positive charge $Nq$, where the charge of a single electron is $-q$. Since the nucleus is much more massive than the electrons, we can treat the nucleus as being fixed and the electrons as moving in a potential of the form $-Nq^2/|\mathbf{x}|$. As a very rough approximation to the structure of such an atom, we can ignore the electron-electron interaction and take a Hamiltonian of the form

\[ \sum_{j=1}^{N} \left( -\frac{\hbar^2}{2m}\Delta_j - \frac{Nq^2}{|\mathbf{x}^j|} \right), \]

where $\Delta_j$ is the Laplacian acting on the $j$th variable. That is, we are taking our Hamiltonian to be simply

\[ \hat H \otimes I \otimes \cdots \otimes I + I \otimes \hat H \otimes \cdots \otimes I + \cdots + I \otimes \cdots \otimes I \otimes \hat H, \]

where $\hat H$ is the Hamiltonian for a single electron.

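If, say, $N$ is even, the lowest-energy state for this Hamiltonian in the antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ will be

\[ \Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \mathrm{AS}\left( \psi_0^{\uparrow}(\mathbf{x}^1) \otimes \psi_0^{\downarrow}(\mathbf{x}^2) \otimes \psi_1^{\uparrow}(\mathbf{x}^3) \otimes \cdots \otimes \psi_{N/2-1}^{\uparrow}(\mathbf{x}^{N-1}) \otimes \psi_{N/2-1}^{\downarrow}(\mathbf{x}^N) \right). \tag{19.11} \]

If $N$ is odd, the product ends with $\psi_{(N-1)/2}^{\uparrow}(\mathbf{x}^N)$. The notation in (19.11) is as follows. First, $\mathrm{AS}$ is the antisymmetrization operator, given by

\[ \mathrm{AS}(f)(\mathbf{x}^1, \ldots, \mathbf{x}^N) = \sum_{\sigma \in S_N} \operatorname{sign}(\sigma)\, f(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}). \]

Second, the functions $\psi_0, \psi_1, \psi_2, \ldots$ are the eigenvectors in $L^2(\mathbb{R}^3)$ for the Hamiltonian of a single particle in $\mathbb{R}^3$ moving in a potential of the form $-Nq^2/|\mathbf{x}|$, arranged so that the eigenvalues of $\psi_j$ are weakly increasing with $j$. The $\psi_j$’s are just the states computed in Chap. 18, but with $q$ replaced by $\sqrt{N}\,q$. Third, $\psi_j^{\uparrow}(\mathbf{x})$ denotes $\psi_j(\mathbf{x}) \otimes e_1$ and $\psi_j^{\downarrow}(\mathbf{x})$ denotes $\psi_j(\mathbf{x}) \otimes e_2$, where $\{e_1, e_2\}$ is the standard basis for $\mathbb{C}^2$.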
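The antisymmetrization operator can be tried out on a small grid. The following sketch assumes that an $N$-particle function is stored as an array with $N$ axes, each axis indexing a hypothetical combined (position, spin) label for one particle; the function names are illustrative. It sums the signed axis permutations of the array, which on a product of orbitals reproduces a Slater determinant.

```python
import itertools
from functools import reduce

import numpy as np

def perm_sign(perm):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(f):
    """Sum of sign(sigma) times f with its particle axes permuted by sigma.

    np.transpose applies the inverse relabeling, but sigma and its inverse
    have the same sign and both range over the whole group, so the sum agrees.
    """
    total = np.zeros_like(f)
    for perm in itertools.permutations(range(f.ndim)):
        total = total + perm_sign(perm) * np.transpose(f, perm)
    return total

def product_state(*orbitals):
    """Array with entries orbitals[0][i0] * orbitals[1][i1] * ..."""
    return reduce(np.multiply.outer, orbitals)

rng = np.random.default_rng(0)
phi0, phi1, phi2 = (rng.normal(size=5) for _ in range(3))

F = antisymmetrize(product_state(phi0, phi1, phi2))
repeated = antisymmetrize(product_state(phi0, phi0, phi2))

print(np.allclose(np.swapaxes(F, 0, 1), -F))  # sign flip under one interchange
print(np.allclose(repeated, 0.0))             # a repeated orbital gives zero
```

The second check is the exclusion principle in miniature: once an orbital is used twice (with the same spin label), the antisymmetrized product is identically zero.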




What the expression for $\Psi_0$ means is that, if we ignore (at first) the interaction between the electrons, but retain the Pauli exclusion principle, then we put the first electron into the ground state of the single-electron system, with “spin up” (i.e., tensored with $e_1$). Then we put the second electron into the ground state with “spin down” (tensored with $e_2$). Then the third electron goes into the first excited state of the single-electron system with spin up, and so on. Of course, this model of a multielectron atom is very rough, since the interaction between the electrons actually plays a significant role. Nevertheless, this model highlights the critical role played by the exclusion principle, which forces successive electrons to go into higher and higher energy states. In particular, this crude approximation suggests (correctly!) that even for more realistic models of a multielectron atom, the lowest energy level in the antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ is much higher than the lowest energy level of the same Hamiltonian in all of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$.
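The gap between the two ground-state energies can be seen in a toy calculation. The sketch below assumes noninteracting particles and a hypothetical hydrogen-like list of orbital energies $-1/(2n^2)$, each repeated $n^2$ times (in units chosen for the example, not the scaled units of the text). It compares filling the orbitals two electrons at a time with placing every particle in the lowest orbital.

```python
def fermion_ground_energy(orbital_energies, n_particles, spin_states=2):
    """Fill the lowest orbitals, allowing at most `spin_states` particles per orbital."""
    levels = sorted(orbital_energies)
    if n_particles > spin_states * len(levels):
        raise ValueError("not enough orbitals for this many particles")
    return sum(levels[j // spin_states] for j in range(n_particles))

def unrestricted_ground_energy(orbital_energies, n_particles):
    """Lowest energy with no symmetry constraint: every particle in the lowest orbital."""
    return n_particles * min(orbital_energies)

# Hypothetical hydrogen-like orbital energies: level n has energy -1/(2 n^2)
# and orbital degeneracy n^2 (spin is handled separately via spin_states).
orbitals = [-1.0 / (2 * n * n) for n in range(1, 5) for _ in range(n * n)]

n_electrons = 10
e_fermi = fermion_ground_energy(orbitals, n_electrons)
e_free = unrestricted_ground_energy(orbitals, n_electrons)

print(e_fermi, e_free)
```

With ten particles, two go into the $n = 1$ orbital and eight into the $n = 2$ orbitals, so the antisymmetric ground energy is $-2$ in these units, compared with $-5$ when all ten are allowed to share the lowest orbital.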

Meanwhile, in quantum statistical mechanics, one considers a large collection of identical particles confined to some finite region of space. If the system is isolated (rather than in thermal equilibrium with its environment), the goal of statistical mechanics is to “count” the number $N(E)$ of quantum states with energy less than $E$, as a function of $E$. [That is, $N(E)$ is the number of eigenvalues for the Hamiltonian less than $E$, counted with their multiplicity.] As the preceding discussion of the Pauli exclusion principle suggests, we will get very different answers for $N(E)$ if the particles are fermions than if they are bosons. Bosons are said to follow Bose-Einstein statistics, whereas fermions are said to follow Fermi-Dirac statistics. The term “statistics” here refers to the different behavior of the two types of particles in quantum statistical mechanics. The spin-statistics theorem in quantum field theory tells us that particles with integer spin have to be bosons (obeying Bose-Einstein statistics) and particles with half-integer spin have to be fermions (obeying Fermi-Dirac statistics).
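The difference in the counting function can be made explicit for a small system. This sketch assumes noninteracting particles and a hypothetical finite list of single-particle eigenvalues (listed with multiplicity); the function name and the `"bose"`/`"fermi"` labels are invented for the example. Symmetric states correspond to multisets of occupied levels and antisymmetric states to sets of distinct levels.

```python
from itertools import combinations, combinations_with_replacement

def count_states(levels, n_particles, max_energy, statistics):
    """Count N-particle eigenvalues strictly below max_energy.

    levels: single-particle eigenvalues, listed with multiplicity.
    statistics: "bose" (levels may be reused) or "fermi" (each level used at most once).
    """
    if statistics == "bose":
        chooser = combinations_with_replacement
    elif statistics == "fermi":
        chooser = combinations
    else:
        raise ValueError("statistics must be 'bose' or 'fermi'")
    return sum(
        1
        for occupied in chooser(range(len(levels)), n_particles)
        if sum(levels[k] for k in occupied) < max_energy
    )

# Hypothetical equally spaced single-particle levels.
levels = [0, 1, 2, 3, 4]

n_bose = count_states(levels, 2, 3, "bose")    # (0,0), (0,1), (0,2), (1,1)
n_fermi = count_states(levels, 2, 3, "fermi")  # (0,1), (0,2)

print(n_bose, n_fermi)  # 4 2
```

Even for two particles the counts differ by a factor of two at this energy, and the discrepancy grows quickly with the number of particles.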

One fascinating example of quantum statistical mechanics occurs when the particles are bosons and the interaction between particles is negligible. In that case, the lowest energy state will simply be

\[ \Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \psi_0(\mathbf{x}^1) \otimes \psi_0(\mathbf{x}^2) \otimes \cdots \otimes \psi_0(\mathbf{x}^N), \]

where $\psi_0$ is the ground state of the single-particle system. Now, quantum statistical mechanics tells us that at a given temperature, the state of the system will be an (incoherent) superposition of the ground state and the various excited states. If the temperature is low enough, then the coefficient of the ground state will be close to 1, and thus, “all the particles are in the ground state.” A system in such a state is called a Bose-Einstein condensate, a state that was predicted on theoretical grounds by Satyendra Nath Bose and Einstein in the 1920s. Bose-Einstein condensates were first observed experimentally in laser-cooled gases in June 1995 by Eric Cornell and Carl Wieman, in work for which they, along with Wolfgang Ketterle, were awarded the 2001 Nobel Prize in physics.


19.8  Exercises 

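1. Suppose that $X$ is a Hilbert-Schmidt operator on $\mathbf{H}$ and that $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$. Show that

\[ \|X\|_{HS}^2 = \sum_{j,k} |\langle e_j, X e_k \rangle|^2. \]

2. Given $\phi, \psi \in \mathbf{H}$, let $|\phi\rangle\langle\psi|$ denote the operator defined in Notation 3.28. Show that if $A \in \mathcal{B}(\mathbf{H})$ is trace class, then

\[ \operatorname{trace}(A\, |\phi\rangle\langle\psi|) = \langle \psi, A\phi \rangle. \]

Hint: If $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then for any $\chi \in \mathbf{H}$, we have $\chi = \sum_j \langle e_j, \chi \rangle e_j$.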

3. Suppose $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is a linear functional with the properties (1) that $\Phi(A)$ is real whenever $A$ is self-adjoint and (2) that $\Phi(A)$ is real and non-negative whenever $A$ is self-adjoint and non-negative. Show that if $A$ is self-adjoint, then

\[ -\|A\|\,\Phi(I) \le \Phi(A) \le \|A\|\,\Phi(I). \]

Conclude that $\Phi$ is bounded relative to the operator norm on $\mathcal{B}(\mathbf{H})$.

Hint: Show that if $A$ is self-adjoint, then $\|A\| I + A$ and $\|A\| I - A$ are non-negative.

4. An operator $A \in \mathcal{B}(\mathbf{H})$ is said to have finite rank if $\operatorname{range}(A)$ is finite dimensional.

(a) Show that if $A \in \mathcal{B}(\mathbf{H})$ has finite rank, then so does $A^*$.

(b) Given $A \in \mathcal{B}(\mathbf{H})$, show that $A$ has finite rank if and only if there exist vectors $\phi_1, \ldots, \phi_N$ and $\psi_1, \ldots, \psi_N$ such that

\[ A = |\phi_1\rangle\langle\psi_1| + \cdots + |\phi_N\rangle\langle\psi_N|. \]

(c) Let $A$ be any element of $\mathcal{B}(\mathbf{H})$, let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$, and let $P_N$ be the orthogonal projection onto the span of $e_1, \ldots, e_N$. Show that $P_N A$ has finite rank and that for all $\psi \in \mathbf{H}$, we have

\[ \lim_{N \to \infty} \| P_N A \psi - A \psi \| = 0. \]

Note: This result shows that each bounded operator can be expressed as a strong limit of finite-rank operators. By contrast, if $\dim \mathbf{H} = \infty$, then Part (a) of Exercise 5 shows that not every bounded operator can be expressed as an operator-norm limit of finite-rank operators.


5. In this exercise, assume that $\dim \mathbf{H} = \infty$.

(a) Show that if $A$ has finite rank, then $\|A + cI\| \ge |c|$ for any $c \in \mathbb{C}$. (With $c = -1$, this shows that $I$ is not an operator-norm limit of finite-rank operators.)

(b) Let $\mathcal{K}(\mathbf{H})$ denote the closure of the finite-rank operators with respect to the operator norm on $\mathcal{B}(\mathbf{H})$. Let $V$ denote the space of operators of the form $B + cI$, with $B \in \mathcal{K}(\mathbf{H})$. Define a linear functional $\Phi : V \to \mathbb{C}$ by $\Phi(B + cI) = c$ for all $B \in \mathcal{K}(\mathbf{H})$. Show that $|\Phi(A)| \le \|A\|$ for all $A \in V$.

Note: It can be shown that $\mathcal{K}(\mathbf{H})$ is precisely the space of compact operators on $\mathbf{H}$.

(c) Let $\Psi_1 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ be any linear functional such that $\Psi_1 = \Phi$ on $V$ and such that $|\Psi_1(A)| \le \|A\|$ for all $A \in \mathcal{B}(\mathbf{H})$. (Such a functional exists by the Hahn-Banach theorem.) Let $\Psi_2 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ be defined by

\[ \Psi_2(A) = \frac{1}{2}\left( \Psi_1(A) + \overline{\Psi_1(A^*)} \right). \]

Show that $\Psi_2$ satisfies Properties 1, 2, and 3 of Definition 19.6, but that there does not exist any density matrix $\rho$ such that $\Psi_2(A) = \operatorname{trace}(\rho A)$ for all $A \in \mathcal{B}(\mathbf{H})$. (Thus, in light of Theorem 19.9, $\Psi_2$ must not satisfy Property 4 of Definition 19.6.)

6. In Exercises 6, 7, and 8, assume that each density matrix $\rho$ is compact, so that $\rho$ has an orthonormal basis $\{e_j\}$ of eigenvectors, for which the associated eigenvalues $\{\lambda_j\}$ are real and tend to zero as $j$ tends to infinity. (Compare Theorem VI.16 in [34].)

Show that a density matrix $\rho$ is a pure state if and only if $\operatorname{trace}(\rho^2) = 1$.

7. (a) Show that each mixed state $\rho$ is a nontrivial convex combination of other density matrices.

(b) Show that a pure state cannot be expressed as a nontrivial convex combination of other density matrices.

Hint: Show that the function $f(\lambda) := \operatorname{trace}\left( (\lambda \rho_1 + (1 - \lambda)\rho_2)^2 \right)$ is a convex function of $\lambda$.

8. For any density matrix $\rho$, show that the von Neumann entropy $S(\rho) := \operatorname{trace}(-\rho \log \rho)$ is zero if and only if $\rho$ is a pure state.

9.  Prove  Lemma  19.14. 

Hint: First use the principle of uniform boundedness (Theorem A.40) to show that there exists a constant $C$ with $\|A_n\| \le C$ for all $n$. Then, if $\{f_j\}$ is an orthonormal basis for $\mathbf{H}_2$, decompose $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ as the Hilbert space direct sum of the subspaces $\mathbf{H}_1 \otimes f_j$, where each of these subspaces is isometrically identified with $\mathbf{H}_1$ in the obvious way.

10. Suppose that $\psi_1$ and $\psi_2$ are two linearly independent elements of $L^2(\mathbb{R}^3; \mathbb{C}^2)$. Show that the function $\Psi$ in (19.10) is a nonzero element of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$.


20 

The  Path  Integral  Formulation 
of  Quantum  Mechanics 


We turn now to a topic that is important already for ordinary quantum mechanics and essential in quantum field theory: the so-called path integral. In the setting of ordinary quantum mechanics (of the sort we have been considering in this book), the integrals in question are over spaces of “paths,” that is, maps of some interval $[a, b]$ into $\mathbb{R}^n$. In the setting of quantum field theory, the integrals are integrals over spaces of “fields,” that is, maps of some region into $\mathbb{R}^n$. Formal integrals of this sort abound in the physics literature, and it is typically difficult to make rigorous mathematical sense of them, although much effort has been expended in the attempt! In this chapter, we will develop a rigorous integral over spaces of paths by using the Wiener measure, resulting in the Feynman-Kac formula.

We  begin  with  the  Trotter  product  formula,  which  will  be  our  main  tool 
in  deriving  the  path  integral  formulas.  From  there  we  turn  to  the  (heuristic) 
path  integral  formula  of  Feynman,  and  then  to  the  rigorous  version  of 
Feynman’s  result  obtained  by  M.  Kac,  the  so-called  Feynman-Kac  formula. 
Although  it  is  not  feasible  to  give  complete  proofs  of  all  results  presented 
here,  we  give  enough  proofs  to  get  a  flavor  of  the  mathematics  involved. 
We  will  prove  a  version  of  the  Trotter  product  formula  and,  assuming  the 
existence  of  the  Wiener  measure,  a  version  of  the  Feynman-Kac  formula. 


B.C. Hall, Quantum Theory for Mathematicians, Graduate Texts in Mathematics 267, DOI 10.1007/978-1-4614-7116-5_20, © Springer Science+Business Media New York 2013




20.1  Trotter  Product  Formula 


The Lie product formula (Point 7 of Theorem 16.15) says that for all $X$ and $Y$ in $M_n(\mathbb{C})$, we have

\[ e^{X+Y} = \lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m. \]

The  Trotter  product  formula  asserts  that  a  similar  result  holds  for  certain 
classes  of  unbounded  operators  on  Hilbert  spaces. 

Theorem 20.1 (Trotter Product Formula) Suppose that $A$ and $B$ are self-adjoint operators on $\mathbf{H}$ and that $A + B$ is densely defined and essentially self-adjoint on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. Then the following results hold.

1. For all $\psi \in \mathbf{H}$, we have

\[ \lim_{N \to \infty} \left\| e^{it(A+B)}\psi - \left( e^{itA/N} e^{itB/N} \right)^N \psi \right\| = 0. \tag{20.1} \]

2. If $A$ and $B$ are bounded below, then for all $\psi \in \mathbf{H}$, we have

\[ \lim_{N \to \infty} \left\| e^{-t(A+B)}\psi - \left( e^{-tA/N} e^{-tB/N} \right)^N \psi \right\| = 0. \tag{20.2} \]

In both results, the expression $A + B$ refers to the unique self-adjoint extension of the operator defined on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$.

In the usual terminology of functional analysis, (20.1) asserts that the operators $(e^{itA/N} e^{itB/N})^N$ converge to $e^{it(A+B)}$ in the “strong operator topology,” and similarly with (20.2).
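In finite dimensions, where every self-adjoint operator is bounded and (20.1) reduces to the Lie product formula, the convergence can be observed directly. The following is a small numerical sketch under that assumption, using random Hermitian matrices normalized to operator norm 1; the helper names are illustrative. It reports the operator-norm distance between $e^{it(A+B)}$ and $(e^{itA/N}e^{itB/N})^N$ for increasing $N$.

```python
import numpy as np

def exp_i(H, t):
    """Return exp(i t H) for a Hermitian matrix H via its spectral decomposition."""
    eigvals, eigvecs = np.linalg.eigh(H)
    return (eigvecs * np.exp(1j * t * eigvals)) @ eigvecs.conj().T

def trotter_error(A, B, t, N):
    """Operator-norm distance between exp(it(A+B)) and (exp(itA/N) exp(itB/N))^N."""
    exact = exp_i(A + B, t)
    step = exp_i(A, t / N) @ exp_i(B, t / N)
    return np.linalg.norm(exact - np.linalg.matrix_power(step, N), 2)

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random Hermitian matrix scaled to have operator norm 1."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    H = (M + M.conj().T) / 2
    return H / np.linalg.norm(H, 2)

A, B = random_hermitian(4), random_hermitian(4)
errors = [trotter_error(A, B, 1.0, N) for N in (1, 10, 100, 1000)]

for N, err in zip((1, 10, 100, 1000), errors):
    print(N, err)
```

For matrices the error decays roughly like $1/N$; the content of Theorem 20.1 is that strong convergence survives for suitable unbounded operators, where no such norm estimate is available in general.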

We will give a proof of this result in the special case in which $A + B$ is densely defined and self-adjoint on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. This condition holds, for example, whenever the Kato-Rellich theorem (Theorem 9.37) applies. See Sect. A.5 of [14] for a proof of the version stated above.

Proof. Since all the operators in Point 1 of the theorem are unitary, it is easy to see that if the result holds on some dense subspace $W$ of $\mathbf{H}$, it holds on all of $\mathbf{H}$. In Point 2 of the theorem, we first make a simple reduction to the case where $A$ and $B$ are non-negative, and then have the same conclusion, since all operators involved will then be contractions.

We will prove Point 1 of the theorem, with the proof of Point 2 being similar. Let us introduce the notation $S_s := e^{is(A+B)}$ and $T_s := e^{isA} e^{isB}$. What we want to prove is that for each $\psi \in \mathbf{H}$, the quantity $\|(S_t - (T_{t/N})^N)\psi\|$ tends to zero as $N$ tends to infinity. Now, a simple calculation shows that

\[ (S_t - (T_{t/N})^N)\psi = \sum_{j=0}^{N-1} (T_{t/N})^j (S_{t/N} - T_{t/N}) (S_{t/N})^{N-j-1} \psi. \tag{20.3} \]

Since $S_\cdot$ is a one-parameter unitary group, $(S_{t/N})^{N-j-1}\psi = S_s\psi$, where $s = (N - j - 1)t/N$. Thus, if we let $\psi_s = S_s\psi$ and use the fact that each $(T_{t/N})^j$ is unitary and hence norm-preserving, we have

\[ \left\| (S_t - (T_{t/N})^N)\psi \right\| \le N \sup_{0 \le s \le t} \left\| (S_{t/N} - T_{t/N})\psi_s \right\|. \tag{20.4} \]
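The telescoping identity behind (20.3) is purely algebraic and can be checked for matrices. The sketch below assumes two arbitrary unitary matrices $S$ and $T$ (hypothetical stand-ins for $S_{t/N}$ and $T_{t/N}$, generated from a QR factorization), verifies that $S^N - T^N$ equals the sum in (20.3), and confirms the resulting bound $\|S^N - T^N\| \le N\|S - T\|$.

```python
import numpy as np

def telescoping_sum(S, T, N):
    """Sum over j = 0..N-1 of T^j (S - T) S^(N-1-j)."""
    total = np.zeros_like(S)
    for j in range(N):
        total = total + (
            np.linalg.matrix_power(T, j) @ (S - T) @ np.linalg.matrix_power(S, N - 1 - j)
        )
    return total

rng = np.random.default_rng(2)

def random_unitary(n):
    """Unitary matrix from the QR factorization of a random complex matrix."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(M)
    return Q

S, T = random_unitary(3), random_unitary(3)
N = 7

lhs = np.linalg.matrix_power(S, N) - np.linalg.matrix_power(T, N)
rhs = telescoping_sum(S, T, N)

print(np.allclose(lhs, rhs))
print(np.linalg.norm(lhs, 2) <= N * np.linalg.norm(S - T, 2) + 1e-12)
```

The bound uses only that powers of $S$ and $T$ have norm 1, which is why the same argument works with contractions in Point 2 of the theorem.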


Now, for any $\psi$ in $\mathrm{Dom}(A + B)$, we have

\[ \lim_{N \to \infty} N(S_{t/N}\psi - \psi) = it(A + B)\psi, \]

by Stone’s theorem. Meanwhile, according to Exercise 2, we have

\[ \lim_{s \to 0} \frac{1}{s}(T_s - I)\psi = iA\psi + iB\psi, \tag{20.5} \]

for all $\psi \in \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. (This result is clear at the heuristic level.) Thus,

\[ \lim_{N \to \infty} N(S_{t/N} - T_{t/N})\psi = \lim_{N \to \infty} N(S_{t/N} - I)\psi - \lim_{N \to \infty} N(T_{t/N} - I)\psi = it(A + B)\psi - it(A + B)\psi = 0 \tag{20.6} \]

for every $\psi \in \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$.

Let $W = \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$, which is, by assumption, dense in $\mathbf{H}$, equipped with the norm given by

\[ \|\psi\|_W := \|\psi\| + \|(A + B)\psi\|. \]

Since we are assuming $A + B$ is self-adjoint, and thus also closed, on $W$, we see that $W$ is a Banach space with respect to $\|\cdot\|_W$ (Exercise 6 in Chap. 9). Now, the operators $N(S_{t/N} - T_{t/N})$ are certainly bounded from $W$ to $\mathbf{H}$, for each $N$. Furthermore, (20.6) shows that for each $\psi \in W$, we have

\[ \sup_N \left\| N(S_{t/N} - T_{t/N})\psi \right\| < \infty. \]

Thus, by the principle of uniform boundedness (Theorem A.40), there is a constant $C$ such that

\[ \left\| N(S_{t/N} - T_{t/N})\psi \right\| \le C \|\psi\|_W \]

for all $\psi \in W$. It then follows (Exercise 3) that $\|N(S_{t/N} - T_{t/N})\psi\|$ tends to zero uniformly on every compact subset of $W$.

Suppose, now, that for each $\psi \in W$, the map $s \mapsto \psi_s$ is continuous in $W$. If so, the image of the compact interval $[0, t]$ under $s \mapsto \psi_s$ will be compact in $W$, and so $\|N(S_{t/N} - T_{t/N})\psi_s\|$ will tend to zero uniformly in $s$. Thus, by (20.4), we will have Point 1 of the theorem. To establish the desired continuity, we first note that by Lemma 10.17, the operators $S_s = e^{is(A+B)}$ preserve $\mathrm{Dom}(A + B)$, which is equal to $W$, by assumption. Then for any $s, r \in [0, t]$ and $\psi \in W$, we have

\[
\begin{aligned}
\left\| e^{is(A+B)}\psi - e^{ir(A+B)}\psi \right\|_W
&= \left\| (e^{is(A+B)} - e^{ir(A+B)})\psi \right\| + \left\| (A + B)(e^{is(A+B)} - e^{ir(A+B)})\psi \right\| \\
&= \left\| (e^{is(A+B)} - e^{ir(A+B)})\psi \right\| + \left\| (e^{is(A+B)} - e^{ir(A+B)})(A + B)\psi \right\|,
\end{aligned}
\tag{20.7}
\]

where we have used Lemma 10.17 again in the second equality. The strong continuity of $e^{is(A+B)}$ (Proposition 10.14) then ensures that the right-hand side of (20.7) tends to zero as $s$ approaches $r$. ■


20.2  Formal  Derivation  of  the  Feynman  Path 
Integral 

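In this section, we apply Point 1 of the Trotter product formula to the operator

\[ -\frac{1}{\hbar}\hat H = \frac{\hbar}{2m}\Delta - \frac{1}{\hbar}V(X). \tag{20.8} \]

Let us call the operators on the right-hand side of (20.8) $A$ and $B$, respectively, and let us assume $V$ is sufficiently nice that $\hat H$ is essentially self-adjoint on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. Any bounded potential certainly has this property, as do many unbounded potentials. (See, e.g., Theorem 9.38.) Point 1 of Theorem 20.1 then tells us that

\[ e^{-it\hat H/\hbar}\psi = \lim_{N \to \infty} \left[ \exp\left( \frac{it\hbar\Delta}{2mN} \right) \exp\left( -\frac{itV(X)}{N\hbar} \right) \right]^N \psi. \]

Under mild assumptions on $\psi$, Theorem 4.5 (extended to $n$ dimensions) tells us that $\exp(it\hbar\Delta/(2mN))$ may be computed as

\[ \left( e^{it\hbar\Delta/(2mN)}\psi \right)(\mathbf{x}_0) = \left( \frac{mN}{2\pi i t\hbar} \right)^{n/2} \int_{\mathbb{R}^n} \exp\left\{ i\,\frac{mN}{2t\hbar} |\mathbf{x}_1 - \mathbf{x}_0|^2 \right\} \psi(\mathbf{x}_1)\, d\mathbf{x}_1. \]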


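Meanwhile, $\exp(-itV(X)/(N\hbar))$ is simply a multiplication operator. Thus, assuming that Theorem 4.5 applies at each stage, we have

\[
\begin{aligned}
\left( \left[ \exp\left( \frac{it\hbar\Delta}{2mN} \right) \exp\left( -\frac{itV(X)}{N\hbar} \right) \right]^N \psi \right)(\mathbf{x}_0)
= C &\int_{\mathbb{R}^n} \exp\left\{ i\,\frac{mN}{2t\hbar} |\mathbf{x}_1 - \mathbf{x}_0|^2 \right\} \exp\left\{ -\frac{itV(\mathbf{x}_1)}{N\hbar} \right\} \\
\times &\int_{\mathbb{R}^n} \exp\left\{ i\,\frac{mN}{2t\hbar} |\mathbf{x}_2 - \mathbf{x}_1|^2 \right\} \exp\left\{ -\frac{itV(\mathbf{x}_2)}{N\hbar} \right\} \\
\times &\cdots \\
\times &\int_{\mathbb{R}^n} \exp\left\{ i\,\frac{mN}{2t\hbar} |\mathbf{x}_N - \mathbf{x}_{N-1}|^2 \right\} \exp\left\{ -\frac{itV(\mathbf{x}_N)}{N\hbar} \right\} \\
&\times \psi(\mathbf{x}_N)\, d\mathbf{x}_N\, d\mathbf{x}_{N-1} \cdots d\mathbf{x}_1,
\end{aligned}
\]

where $C = (mN/(2\pi i t\hbar))^{nN/2}$. Letting $\varepsilon = t/N$ and assuming we can freely rearrange the order of integration, we obtain

\[
\left( e^{-it\hat H/\hbar}\psi \right)(\mathbf{x}_0)
= \lim_{N \to \infty} C \int_{(\mathbb{R}^n)^N} \exp\left\{ \frac{i}{\hbar} \sum_{j=1}^{N} \varepsilon \left[ \frac{m}{2} \left| \frac{\mathbf{x}_j - \mathbf{x}_{j-1}}{\varepsilon} \right|^2 - V(\mathbf{x}_j) \right] \right\} \psi(\mathbf{x}_N)\, d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N.
\tag{20.9}
\]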


So  far,  the  argument  is  mostly  rigorous,  coming  from  the  Trotter  product 
formula  and  Theorem  4.5.  The  nonrigorous  part  comes  in  attempting  to 
evaluate  the  limit  on  the  right-hand  side  of  (20.9).  Let  us  think  of  the 
values $\mathbf{x}_j$, $j = 0, \ldots, N$, as constituting the values of a path $\mathbf{x}(s)$ at the points $s_j := j\varepsilon = jt/N$:

\[ \mathbf{x}_j = \mathbf{x}(jt/N). \]

Since the distance between $s_{j-1}$ and $s_j$ is $\varepsilon$, the quantity $|\mathbf{x}_j - \mathbf{x}_{j-1}|/\varepsilon$ is an approximation to the magnitude of the derivative of $\mathbf{x}(s)$ with respect to $s$. Meanwhile, the sum over $j$ in the right-hand side of (20.9) is an approximation to an integral. Thus, if we then take the limit of the right-hand side of (20.9) in a totally nonrigorous fashion, we obtain


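\[
\left( e^{-it\hat H/\hbar}\psi \right)(\mathbf{x}_0)
= C \int_{\substack{\text{paths with} \\ \mathbf{x}(0) = \mathbf{x}_0}} \exp\left\{ \frac{i}{\hbar} \int_0^t \left[ \frac{m}{2} \left| \frac{d\mathbf{x}}{ds} \right|^2 - V(\mathbf{x}(s)) \right] ds \right\} \psi(\mathbf{x}(t))\, \mathcal{D}\mathbf{x}.
\tag{20.10}
\]

Here, $C$ is a normalization constant and $\mathcal{D}\mathbf{x}$ is something like “Lebesgue measure” on the space of all paths $\mathbf{x}(\cdot)$ mapping $[0, t]$ into $\mathbb{R}^n$. (The quantity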
$\mathbf{x}$ in the expression $\mathcal{D}\mathbf{x}$ is a path, not a point in $\mathbb{R}^n$.)

The reader who is familiar with the Lagrangian approach to mechanics will recognize the expression in square brackets in the exponent on the right-hand side of (20.10) as the Lagrangian of the particle, $L = T - V$, where $T = (1/2)m|\mathbf{v}|^2$ is the kinetic energy and $V$ is the potential energy. The integral of the Lagrangian over some time interval is called the action functional, denoted by the letter $S$. That is to say, given a path $\mathbf{x}(\cdot)$, we define the action functional of $\mathbf{x}(\cdot)$ over a time-interval $[a, b]$ as follows:

\[
S(\mathbf{x}(\cdot), a, b) := \int_a^b \left[ \frac{m}{2} \left| \frac{d\mathbf{x}}{ds} \right|^2 - V(\mathbf{x}(s)) \right] ds.
\tag{20.11}
\]
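The sum in the exponent of (20.9) is a Riemann-sum approximation to the action (20.11), and this can be checked numerically for a smooth path. The sketch below assumes one space dimension, mass $m = 1$, the hypothetical potential $V(x) = x^2/2$, and the sample path $x(s) = \sin s$ on $[0, 1]$, for which the action is $\tfrac14 \sin 2$; none of these choices comes from the text.

```python
import math

def discretized_action(path, t, mass, potential):
    """Riemann-sum action: sum over j of eps * [ (m/2) ((x_j - x_{j-1})/eps)^2 - V(x_j) ]."""
    n_steps = len(path) - 1
    eps = t / n_steps
    total = 0.0
    for j in range(1, n_steps + 1):
        velocity = (path[j] - path[j - 1]) / eps
        total += eps * (0.5 * mass * velocity ** 2 - potential(path[j]))
    return total

def sample_path(t, n_steps):
    """Values x_j = x(j t / N) of the hypothetical path x(s) = sin(s)."""
    return [math.sin(j * t / n_steps) for j in range(n_steps + 1)]

def harmonic_potential(x):
    return 0.5 * x * x

t_final = 1.0
exact_action = math.sin(2.0 * t_final) / 4.0  # integral of (cos^2 s - sin^2 s)/2 over [0, 1]

approximations = {
    n: discretized_action(sample_path(t_final, n), t_final, 1.0, harmonic_potential)
    for n in (10, 100, 2000)
}

for n, value in approximations.items():
    print(n, value, abs(value - exact_action))
```

The discrepancy shrinks roughly in proportion to $\varepsilon = t/N$. This convergence holds for a fixed smooth path; the nonrigorous step in the derivation is that the integral in (20.9) ranges over all values of the $\mathbf{x}_j$, most of which do not come from smooth paths.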


In  Lagrangian  mechanics,  one  shows  that  the  solutions  to  Newton’s  law  are 
precisely  the  stationary  points  of  the  action  functional.  Using  the  notation 
in  (20.11),  we  may  rewrite  (20.10)  as 


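\[
\left( e^{-it\hat H/\hbar}\psi \right)(\mathbf{x}_0)
= C \int_{\substack{\text{paths with} \\ \mathbf{x}(0) = \mathbf{x}_0}} \exp\left\{ \frac{i}{\hbar} S(\mathbf{x}(\cdot), 0, t) \right\} \psi(\mathbf{x}(t))\, \mathcal{D}\mathbf{x}.
\tag{20.12}
\]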

This  formula  is  the  Feynman  path  integral  formula. 

Now, knowledge of Lagrangian mechanics is not directly relevant to the derivation of the Feynman path integral formula. Nevertheless, it is intriguing that an important quantity from classical mechanics should appear in the Feynman path integral formula in quantum mechanics. This appearance raises the possibility that one can use the path integral formula to make connections between quantum mechanics and classical mechanics. Indeed, the “method of stationary phase” (when applied, formally, in an infinite-dimensional setting) asserts that for small values of $\hbar$, the main contribution to the right-hand side of (20.12) comes from regions near the stationary points of the action functional, namely the classical trajectories. Using this method, Gutzwiller was able to derive his famous trace formula, which provides predictions of typical eigenvalue spacings for Schrödinger operators based on the behavior of the underlying classical system. More information about this fascinating subject can be found in books on “quantum chaos,” including [19] by Gutzwiller himself.
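The statement that classical trajectories are stationary points of the action can be illustrated in the simplest case. The sketch below assumes a free particle ($V = 0$) with mass 1 moving in one dimension from $x = 0$ to $x = 1$ in unit time, so that the classical path is a straight line, and a hypothetical perturbation $\sin(\pi s)$ that vanishes at both endpoints. It shows that the change in the discretized action is second order in the size of the perturbation.

```python
import numpy as np

def free_action(path, t, mass=1.0):
    """Discretized action of a free particle: (m / 2) * sum |x_j - x_{j-1}|^2 / eps."""
    eps = t / (len(path) - 1)
    return 0.5 * mass * np.sum(np.diff(path) ** 2) / eps

s = np.linspace(0.0, 1.0, 201)
classical = s.copy()        # straight line: solves Newton's law with no force
bump = np.sin(np.pi * s)    # perturbation that keeps both endpoints fixed

def excess_action(delta):
    """Action of the perturbed path minus the action of the classical path."""
    return free_action(classical + delta * bump, 1.0) - free_action(classical, 1.0)

large, small = excess_action(1e-2), excess_action(1e-3)

print(large, small, large / small)
```

Shrinking the perturbation by a factor of 10 shrinks the excess action by a factor of about 100, with no first-order term; in the continuum the excess is $\delta^2\pi^2/4$. In the oscillatory integral (20.12) it is exactly near such paths that the phase $S/\hbar$ varies slowly, which is the heuristic behind the method of stationary phase.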

It is notoriously difficult to attach a rigorous meaning to the right-hand side of the Feynman path integral formula. Note that the formal expression “$\mathcal{D}\mathbf{x}$” is the limit as $N$ tends to infinity of the integral over $(\mathbb{R}^n)^N$ in (20.9) with respect to the Lebesgue measure (i.e., the measure given by $d\mathbf{x}_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N$). Thus, “$\mathcal{D}\mathbf{x}$” should be something like Lebesgue measure on the space of all paths (maps from $[0, t]$ into $\mathbb{R}^n$). However, it is known that an infinite-dimensional vector space (say, a Banach space) does not have any “reasonable” (say, $\sigma$-finite) translation-invariant measure that could play the role of Lebesgue measure. Furthermore, the absolute value of the constant $C$ is easily seen to be infinite. Thus, we certainly cannot
take  the  right-hand  side  of  (20.12)  literally. 

A better approach is to avoid looking at the component parts of the
Feynman path integral and instead to look at the whole expression against
which the function $\psi(x(t))$ is being integrated. If we could attach a rigorous
meaning to the expression

\[
C \exp\left\{ \frac{i}{\hbar} S(x(\cdot), 0, t) \right\} \mathcal{D}x, \tag{20.13}
\]

as, say, a complex-valued measure on the space of continuous paths, then
this could serve to give a meaning to the path integral. It is known, however,
that there is no complex measure on the space of paths that makes the
Feynman path integral formula true. The oscillatory behavior produced by
the $i$ in the exponent in (20.13) makes it difficult to give a rigorous meaning
to the Feynman path integral in its original form.


20.3  The  Imaginary-Time  Calculation 


In trying to give a rigorous meaning to the path integral formula of
Feynman, Kac proceeded by considering the “imaginary time” time-evolution
operator $\exp(-tH/\hbar)$, which is just the usual time-evolution operator
$\exp(-itH/\hbar)$ evaluated with $t$ replaced by $-it$. The idea is that if one
can use path integrals to understand the operators $\exp(-tH/\hbar)$, one can
go back to the “real time” operator $\exp(-itH/\hbar)$ by analytic continuation
with respect to $t$.

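The counterpart of Theorem 4.5 for $\exp(t\hbar\Delta/(2m))$ (proved in the
same way) is

\[
(e^{t\hbar\Delta/(2m)}\psi)(x_0) = \left( \frac{m}{2\pi t\hbar} \right)^{n/2} \int_{\mathbb{R}^n} \exp\left\{ -\frac{m}{2t\hbar} |x_0 - x_1|^2 \right\} \psi(x_1) \, dx_1.
\]

Unlike Theorem 4.5, however, the above expression holds for all $\psi \in L^2(\mathbb{R}^n)$,
with absolute convergence of the integral for every $x_0 \in \mathbb{R}^n$. Applying the
Trotter product formula and rearranging the integral as before gives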


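\[
(e^{-tH/\hbar}\psi)(x_0) = \lim_{N\to\infty} C \int_{(\mathbb{R}^n)^N} \exp\left\{ -\frac{1}{\hbar} \sum_{j=1}^{N} \varepsilon \left[ \frac{m}{2} \left| \frac{x_j - x_{j-1}}{\varepsilon} \right|^2 + V(x_j) \right] \right\} \psi(x_N) \, dx_1 \, dx_2 \cdots dx_N. \tag{20.14}
\]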


If  V  is,  say,  bounded  below,  then  there  is  no  difficulty  in  changing  the 
order  of  integration,  because  of  the  rapid  decay  of  the  integrand.  Note  that 
there  is  a  relative  sign  change  between  the  two  terms  in  square  brackets, 
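compared to (20.9). Taking a formal limit as before gives

\[
(e^{-tH/\hbar}\psi)(x_0) = C \int_{\text{paths with } x(0)=x_0} \exp\left\{ -\frac{1}{\hbar} \int_0^t \left[ \frac{m}{2} \left| \frac{dx}{ds} \right|^2 + V(x(s)) \right] ds \right\} \psi(x(t)) \, \mathcal{D}x. \tag{20.15}
\]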


Note  that  the  integral  in  the  exponent  on  the  right-hand  side  is  not  the 
classical  action  in  (20.11),  because  the  potential  term  has  the  wrong  sign. 
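The origin of the sign change can be seen in a purely formal computation, offered here only as a heuristic sketch rather than as part of the rigorous argument. Substituting $s = -i\tau$ in the exponent $(i/\hbar) S$ of (20.12), and treating $|dx/ds|^2$ formally as the square of $dx/ds$, one finds:

```latex
% Heuristic only: s = -i\tau, so ds = -i\,d\tau and dx/ds = i\,dx/d\tau.
\begin{align*}
\frac{i}{\hbar} \int \left[ \frac{m}{2} \left( \frac{dx}{ds} \right)^2 - V(x) \right] ds
  &= \frac{i}{\hbar} \int \left[ -\frac{m}{2} \left| \frac{dx}{d\tau} \right|^2 - V(x) \right] (-i) \, d\tau \\
  &= -\frac{1}{\hbar} \int \left[ \frac{m}{2} \left| \frac{dx}{d\tau} \right|^2 + V(x) \right] d\tau .
\end{align*}
```

The kinetic term picks up a factor of $i^2 = -1$ while the potential term does not, which is why the two terms end up with the same sign.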

Kac’s  idea  was  to  separate  out  the  quadratic  part  of  the  exponent  on  the 
right-hand  side  of  (20.15)  and  attempt  to  interpret  the  expression 


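\[
C \exp\left\{ -\frac{1}{\hbar} \int_0^t \frac{m}{2} \left| \frac{dx}{ds} \right|^2 ds \right\} \mathcal{D}x \tag{20.16}
\]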


as  a  measure  on  the  space  of  paths.  Specifically,  this  is  a  Gaussian  measure, 
one  with  a  (formal)  density  with  respect  to  the  Lebesgue  measure  that  is 
the  exponential  of  a  quadratic  expression.  There  is  a  well-developed  the¬ 
ory  of  Gaussian  measures  on  infinite-dimensional  spaces.  Although  there 
is  no  Lebesgue  measure  in  the  infinite-dimensional  case,  one  can  construct 
Gaussian  measures  as  limits  of  Gaussian  measures  on  spaces  of  large  finite 
dimension. 


20.4  The  Wiener  Measure 


Kac identified the formal expression in (20.16) as the Wiener measure. To
be precise, for each fixed $x_0 \in \mathbb{R}^n$, there is a Wiener measure $\mu_{x_0}$, where $\mu_{x_0}$
is supported on the set of paths $x : [0, t] \to \mathbb{R}^n$ with $x(0) = x_0$. The Wiener
measure was developed by Norbert Wiener as a rigorous embodiment of
Albert Einstein’s mathematical model of Brownian motion. Einstein, in one
of his 1905 papers, had proposed that the random motion of a very small
particle in water was due to collisions between the particle and the water
molecules. Einstein postulated that the increments of a Brownian path
$x$ [quantities of the form $x(t) - x(s)$] should be independent for disjoint
time intervals and should be normal random variables with mean zero and
variance proportional to $t - s$. The following theorem shows that there
is a unique measure on the space of continuous paths satisfying Einstein’s
criteria. Let $C_{x_0}([0, t]; \mathbb{R}^n)$ denote the space of continuous maps $x(\cdot)$ of $[0, t]$
into $\mathbb{R}^n$ satisfying $x(0) = x_0$, equipped with the supremum norm.


Theorem 20.2 (Wiener) For each vector $x_0 \in \mathbb{R}^n$ and each pair of
positive numbers $\sigma$ and $t$, there exists a unique measure $\mu^{\sigma}_{x_0}$ on the Borel
$\sigma$-algebra in $C_{x_0}([0, t]; \mathbb{R}^n)$ such that the following condition holds. For each
sequence $0 = t_0 < t_1 < \cdots < t_N \le t$ of real numbers and each non-negative
measurable function $f$ on $(\mathbb{R}^n)^N$, we have

\[
\int_{C_{x_0}([0,t];\mathbb{R}^n)} f(x(t_1), x(t_2), \ldots, x(t_N)) \, d\mu^{\sigma}_{x_0}(x)
= C \int_{(\mathbb{R}^n)^N} \exp\left\{ -\frac{1}{2\sigma} \sum_{j=1}^{N} \frac{|x_j - x_{j-1}|^2}{t_j - t_{j-1}} \right\} f(x_1, x_2, \ldots, x_N) \, dx_1 \cdots dx_N, \tag{20.17}
\]

where

\[
C = \prod_{j=1}^{N} \left( 2\pi\sigma (t_j - t_{j-1}) \right)^{-n/2}.
\]


Note that the right-hand side of (20.17) is extremely similar to the
right-hand side of (20.14), except that there are no terms involving the potential
$V$ in the exponent in (20.17). Thus, it is reasonable to think that the Wiener
measure is a rigorous version of the formal expression in (20.16). It should
be noted, however, that the heuristic expression (20.16) is misleading in one
important respect. That expression suggests that the measure is supported
on paths $x(\cdot)$ for which $dx/dt$ belongs to $L^2([0, t]; \mathbb{R}^n)$, since the exponential
factor would seemingly “damp out” any paths for which this is not the case.
This conclusion is, however, incorrect. [One should, in general, be extremely
cautious in drawing conclusions based on purely formal expressions such as
the one in (20.16).] Actually, the “typical” path with respect to the Wiener
measure is nowhere differentiable; that is, the set of paths $x(t)$ that are
differentiable for even one value of $t$ forms a set of measure zero.
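The roughness of typical paths can be seen in a small simulation. The sketch below is an illustration under stated assumptions (one dimension, $\sigma = 1$, $t = 1$, and hypothetical helper names): it samples paths using the independent Gaussian increments described above. The sum of squared increments stays near $\sigma t$ however fine the grid, so the discretized version of $\int |dx/ds|^2 \, ds$, which is that sum divided by the step size, grows without bound. The average difference quotient likewise grows like the step size to the power $-1/2$.

```python
import math
import random

def sample_wiener_path(x0, t, sigma, n_steps, rng):
    """Sample x(t_j) at t_j = j*t/n_steps using independent N(0, sigma*dt) increments."""
    dt = t / n_steps
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(sigma * dt)))
    return path

def mean_abs_difference_quotient(dt, sigma, n_samples, rng):
    """Average of |x(s + dt) - x(s)| / dt over independent increments."""
    std = math.sqrt(sigma * dt)
    return sum(abs(rng.gauss(0.0, std)) for _ in range(n_samples)) / (n_samples * dt)

rng = random.Random(0)   # fixed seed so the run is reproducible
sigma, t = 1.0, 1.0

path = sample_wiener_path(0.0, t, sigma, 10_000, rng)
quadratic_variation = sum((b - a) ** 2 for a, b in zip(path, path[1:]))

coarse = mean_abs_difference_quotient(1e-2, sigma, 20_000, rng)
fine = mean_abs_difference_quotient(1e-4, sigma, 20_000, rng)
print(quadratic_variation, coarse, fine)
```

Shrinking the step by a factor of 100 multiplies the average difference quotient by roughly 10, consistent with increments of size about $\sqrt{\sigma \, dt}$.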

This  discrepancy  is  actually  a  general  feature  of  Gaussian  measures  on 
infinite-dimensional  spaces:  They  are  always  supported  on  a  larger  space 
than  the  formal  expression  would  suggest.  In  the  case  of  the  Wiener  mea¬ 
sure,  the  space  on  which  the  measure  actually  lives  (the  space  of  continuous 
functions)  is  nice  enough  that  no  difficulties  arise  in  the  formulation  of  our 
main  result,  the  Feynman-Kac  formula.  In  the  setting  of  quantum  field  the¬ 
ory,  however,  issues  concerning  the  support  of  a  Gaussian  measure  become 
serious  difficulties.  See  Sect.  20.6  for  more  information. 


20.5  The  Feynman-Kac  Formula 

The Wiener measure gives a rigorous interpretation to the expression in
(20.16). Thus, the Wiener measure encapsulates everything in (20.15)
except for the term involving $V$ in the exponent and the factor of $\psi(x(t))$.
This reasoning accounts for the form of the following result.

Theorem 20.3 (Feynman-Kac Formula) Suppose $V : \mathbb{R}^3 \to \mathbb{R}$ can be
expressed as the sum of a function in $L^2(\mathbb{R}^3)$ and a bounded function. Then
for all $x_0 \in \mathbb{R}^3$, we have

\[
(e^{-tH/\hbar}\psi)(x_0) = \int_{C_{x_0}([0,t];\mathbb{R}^3)} \exp\left\{ -\frac{1}{\hbar} \int_0^t V(x(s)) \, ds \right\} \psi(x(t)) \, d\mu^{\sigma}_{x_0}(x),
\]

where $\mu^{\sigma}_{x_0}$ is the Wiener measure on $C_{x_0}([0, t]; \mathbb{R}^3)$ and where $\sigma = \hbar/m$.

Of course, similar results hold in other dimensions, under suitable
assumptions on the potential. We refer the interested reader to [37] or [14]
for details on different versions of the Feynman-Kac formula. Theorem 20.3
cannot be obtained directly from the Trotter product formula, because the
limit in (20.14) is an $L^2$ limit rather than a pointwise limit. We will
content ourselves with proving an “integrated” version of the Feynman-Kac
formula for nice potentials; Theorem 20.3 is Theorem 6.5 of [37].
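The Feynman-Kac formula suggests a simple Monte Carlo scheme: sample Brownian paths with variance parameter $\sigma = \hbar/m$, weight each by the exponential of minus the integrated potential, and average. The sketch below is an illustration only. It works in one dimension with a harmonic potential, which is outside the hypotheses of Theorem 20.3 as stated, so it assumes that a version of the formula holds in that setting. The function name, the units $\hbar = m = 1$, and the step counts are all assumptions.

```python
import math
import random

def feynman_kac_estimate(V, psi, x0, t, hbar=1.0, m=1.0,
                         n_steps=20, n_paths=20_000, seed=0):
    """Monte Carlo estimate of (exp(-tH/hbar) psi)(x0) in one dimension.

    Paths start at x0 and have independent N(0, sigma*dt) increments with
    sigma = hbar/m; the time integral of V is replaced by a Riemann sum.
    """
    rng = random.Random(seed)
    sigma = hbar / m
    dt = t / n_steps
    step_std = math.sqrt(sigma * dt)
    total = 0.0
    for _path in range(n_paths):
        x = x0
        potential_sum = 0.0
        for _step in range(n_steps):
            x += rng.gauss(0.0, step_std)
            potential_sum += V(x) * dt
        total += math.exp(-potential_sum / hbar) * psi(x)
    return total / n_paths

# Check against a case with a known answer (hbar = m = 1, V(x) = x^2 / 2):
# psi_0(x) = exp(-x^2 / 2) satisfies H psi_0 = psi_0 / 2, so
# (exp(-tH) psi_0)(0) = exp(-t / 2).
harmonic = lambda x: 0.5 * x ** 2
ground_state = lambda x: math.exp(-0.5 * x ** 2)
estimate = feynman_kac_estimate(harmonic, ground_state, x0=0.0, t=1.0)
exact = math.exp(-0.5)
print(estimate, exact)
```

The estimate carries both statistical error (finite number of paths) and discretization error (finite number of time steps), so agreement with the exact value is only approximate.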

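Definition 20.4 Let $C([0, t]; \mathbb{R}^n)$ denote the space of all continuous paths
on $[0, t]$ with values in $\mathbb{R}^n$. For all $\sigma > 0$, let $\mu^{\sigma}$ be the measure on
$C([0, t]; \mathbb{R}^n)$ given by

\[
\mu^{\sigma}(E) = \int_{\mathbb{R}^n} \mu^{\sigma}_{x_0}(E) \, dx_0.
\]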

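Proposition 20.5 Suppose $V : \mathbb{R}^n \to \mathbb{R}$ is bounded and continuous. Then
for all $\phi, \psi \in L^2(\mathbb{R}^n)$, we have

\[
\langle \phi, e^{-tH/\hbar}\psi \rangle = \int_{C([0,t];\mathbb{R}^n)} \overline{\phi(x(0))} \exp\left\{ -\frac{1}{\hbar} \int_0^t V(x(s)) \, ds \right\} \psi(x(t)) \, d\mu^{\sigma}(x),
\]

where $\mu^{\sigma}$ is as in Definition 20.4 and where $\sigma = \hbar/m$.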

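Proof. We begin with (20.14) and apply Theorem 20.2 with parameters
chosen as follows. We take $\sigma = \hbar/m$, we take the sequence $(t_j)$ to be given
by $t_j = jt/N$, and we take $f$ to be the function given by

\[
f(x_1, x_2, \ldots, x_N) = \exp\left\{ -\frac{1}{\hbar} \sum_{j=1}^{N} V(x_j) \frac{t}{N} \right\} \psi(x_N).
\]

Theorem 20.2 then allows us to express the right-hand side of (20.14) as
an integral against the Wiener measure, giving

\[
(e^{-tH/\hbar}\psi)(x_0) = \lim_{N\to\infty} \int_{C_{x_0}([0,t];\mathbb{R}^n)} \exp\left\{ -\frac{1}{\hbar} \sum_{j=1}^{N} V\left( x\left( \frac{jt}{N} \right) \right) \frac{t}{N} \right\} \psi(x(t)) \, d\mu^{\sigma}_{x_0}(x).
\]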

Since the limit in the above equation is an $L^2$ limit, we may move the
inner product with $\phi$ inside the limit on the right-hand side. The integral
with respect to $\mu^{\sigma}_{x_0}$ and the integral with respect to $dx_0$ may then be
combined into a single integral with respect to $\mu^{\sigma}$, giving

\[
\langle \phi, e^{-tH/\hbar}\psi \rangle = \lim_{N\to\infty} \int_{C([0,t];\mathbb{R}^n)} \overline{\phi(x(0))} \exp\left\{ -\frac{1}{\hbar} \sum_{j=1}^{N} V\left( x\left( \frac{jt}{N} \right) \right) \frac{t}{N} \right\} \psi(x(t)) \, d\mu^{\sigma}(x). \tag{20.18}
\]

Now, since $V$ is continuous,

\[
\lim_{N\to\infty} \sum_{j=1}^{N} V\left( x\left( \frac{jt}{N} \right) \right) \frac{t}{N} = \int_0^t V(x(s)) \, ds
\]

for every continuous path $x$. Furthermore, it is easily seen that the
“distribution” of the quantity $x(s)$ with respect to the measure $\mu^{\sigma}$ is the Lebesgue
measure on $\mathbb{R}^n$, for any $s \in [0, t]$. Thus, the function $x \mapsto \phi(x(0))$ is
square-integrable with respect to $\mu^{\sigma}$, with $L^2$ norm equal to the $L^2$ norm
of $\phi$ over $\mathbb{R}^n$, and similarly for $x \mapsto \psi(x(t))$. It follows that the quantity
$\overline{\phi(x(0))} \psi(x(t))$ is an $L^1$ function on $C([0, t]; \mathbb{R}^n)$. Since $V$ is bounded, we
may apply dominated convergence to move the limit inside the integral, at
which point we obtain the desired result. ■


20.6  Path  Integrals  in  Quantum  Field  Theory 


In this section, we briefly discuss the path integral approach to quantum
field theory. We consider quantum field theory in a space-time of dimension
$d$, so that space has dimension $d - 1$. The configuration space for the classical
version of the theory is the collection of “spatial” fields, that is, maps $\phi(x)$
of $\mathbb{R}^{d-1}$ into some finite-dimensional vector space $V$. A path in the space
of fields is then a map $\phi(x, t)$ of $\mathbb{R}^{d-1} \times \mathbb{R} = \mathbb{R}^d$ into $V$. In the path
integral approach to quantum field theory (which is the most commonly
used approach to the subject), one considers integrals over the space of
such paths.

Let us consider, as a simple example, what is called $\phi^4$ theory. In this
theory, the fields $\phi$ map into $\mathbb{R}$ and we consider a path integral of the form

\[
C \int_{\mathcal{F}_d} \exp\left\{ -\frac{1}{\hbar} \int_{\mathbb{R}^d} \left[ c_1 \|\nabla\phi(x)\|^2 + c_2 \phi(x)^2 + c_3 \phi(x)^4 \right] dx \right\} F(\phi) \, \mathcal{D}\phi, \tag{20.19}
\]

for some functional $F(\phi)$ on the space of fields. [The expression in (20.19)
is, more precisely, a “Euclidean” or “imaginary time” path integral. Such
an integral is the counterpart in quantum field theory of the integral
occurring in the Feynman-Kac formula in quantum mechanics.] In (20.19), $\mathcal{F}_d$
represents the space of all “fields” (i.e., functions) mapping our space-time
$\mathbb{R}^d$ into $\mathbb{R}$. In an attempt to make sense of this heuristic expression, we
may follow the strategy we used in deriving the Feynman-Kac formula by
separating out the quadratic part of the exponent. We look, then, for a
measure $\mu$ on $\mathcal{F}_d$ given by the heuristic expression

\[
d\mu(\phi) = C \exp\left\{ -\frac{1}{\hbar} \int_{\mathbb{R}^d} \left[ c_1 \|\nabla\phi(x)\|^2 + c_2 \phi(x)^2 \right] dx \right\} \mathcal{D}\phi. \tag{20.20}
\]


Using the theory of Gaussian measures, one can construct a rigorously
defined measure corresponding to the heuristic expression in (20.20). There
is, however, a serious difficulty with this approach: The measure $\mu$ is
supported on very “rough” fields, much rougher than the heuristic expression
suggests. In fact, we have the following result.

Proposition 20.6 For all $d \ge 1$, there exists a Gaussian measure on the
space $\mathcal{F}_d$ of fields on $\mathbb{R}^d$ corresponding to the heuristic expression (20.20).
For $d \ge 2$, however, this measure is not supported on any space of ordinary
functions, but rather on a space of distributions.

We will not prove this result here; see Sect. 8.5 of [14] for more
information. Here, then, is the problem with the path integral approach to quantum
field theory on space-times of dimension $d \ge 2$: The functional $\int_{\mathbb{R}^d} \phi(x)^4 \, dx$
does not make sense for a “typical” field with respect to the measure $\mu$ in
(20.20). As a result, we cannot make sense of (20.19) simply by absorbing
all the Gaussian part into the definition of the measure $\mu$, since what is
left over is not a $\mu$-almost everywhere defined functional of $\phi$. Indeed, even
a local integral, of the form $\int_U \phi(x)^4 \, dx$ for some bounded region $U$ in
$\mathbb{R}^d$, fails to be almost-everywhere defined with respect to $\mu$. After all, if
$\int_U \phi(x)^4 \, dx$ made sense, then $\phi$ would be a locally $L^4$ function, rather than
a distribution.

It should be emphasized that the difficulty described in the previous
paragraph is not just a technicality that can be swept away by some simple
trick. Furthermore, this difficulty is not specific to $\phi^4$ theory, but is present
in all “nontrivial” field theories. In all interesting field theories, the fields
defined by the Gaussian part of the path integral are fundamentally “too
rough” to allow us to make sense of the non-Gaussian part of the integral.
This phenomenon is the fundamental mathematical difficulty in the path
integral approach to quantum field theory.

To have a chance to make rigorous sense of path integrals in quantum
field theory, one has to employ a complicated regularization process known
as renormalization. This process has, so far, been carried out in a rigorous
fashion only for a very small number of field theories. One of the Clay
Millennium Prize problems is to make rigorous sense out of the Yang-Mills
field theory in four space-time dimensions. See [14] for a detailed survey
of the mathematical issues connected with the path integral approach to
quantum field theory. See also [13] for a treatment of quantum field theory
and renormalization with a greater eye toward the physical content.

Since the roughness of the fields is a major problem in trying to give a
rigorous meaning to path integrals, let us think for a moment about why it arises.
Suppose we wish to construct a Gaussian measure from a certain heuristic
expression of the form $d\mu = C e^{-Q(x)/2} \, \mathcal{D}x$, where $Q$ is a positive-definite
quadratic functional of $x$. A reasonable approach is to consider the (real)
Hilbert space $H$ for which $\|x\|_H^2 = Q(x)$. [In the case of (20.20), $H$ would
be the “Sobolev space” of fields having one derivative in $L^2$.] The heuristic
expression for the Gaussian measure then takes the form

\[
d\mu(x) = C e^{-\|x\|_H^2/2} \, \mathcal{D}x. \tag{20.21}
\]


One might now try to approximate $\mu$ by Gaussian measures $\mu_N$ on
Hilbert spaces $H_N$ of dimension $N < \infty$. If $\dim H < \infty$, then the
expression (20.21) is perfectly rigorous, where the constant $C$ may be taken to
normalize $\mu$ to be a probability measure. A simple calculation (Exercise 4),
however, shows that for any $R$, we have

\[
\lim_{N\to\infty} \mu_N(B_{R,N}) = 0,
\]

where $B_{R,N}$ denotes the ball of radius $R$ in $H_N$. This means that in the
$N \to \infty$ limit, all of the “mass” of the measure is outside the ball of radius
$R$, for every $R$. Thus, in the limit, the measure is supported entirely on
points $x$ where $\|x\|_H = \infty$, that is, on points that are not actually in $H$.

The measures $\mu_N$ do converge to a measure $\mu$ as $N$ tends to infinity, but
$\mu$ does not live on $H$, but on some larger space $B \supset H$. The original space
$H$ is a set of $\mu$-measure zero inside $B$. See [16] for more information. In the
case of the measure $\mu$ corresponding to the heuristic expression in (20.20),
$\mu$ does not live on the Sobolev space of fields with one derivative in $L^2$, as
the expression suggests, but on a larger space, which turns out to be a
space of distributions.
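The escape of mass from every fixed ball can be computed exactly in even dimensions. The sketch below is an illustration under an assumption: it takes $\mu_N$ to be the standard Gaussian probability measure on $\mathbb{R}^N$, with density $(2\pi)^{-N/2} e^{-|x|^2/2}$. For even $N = 2k$ the mass of the ball of radius $R$ is then $1 - e^{-R^2/2} \sum_{j<k} (R^2/2)^j / j!$, which is the chi-square distribution function with $2k$ degrees of freedom evaluated at $R^2$. The function name is hypothetical.

```python
import math

def gaussian_ball_mass(radius, n):
    """Mass of the ball of the given radius under the standard Gaussian on R^n.

    Uses the closed form valid for even n = 2k:
        1 - exp(-R^2/2) * sum_{j < k} (R^2/2)^j / j!
    The subtraction limits accuracy to about 1e-16 in absolute terms.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("this sketch only handles positive even dimensions")
    half = radius ** 2 / 2.0
    term, partial_sum = 1.0, 0.0
    for j in range(n // 2):
        partial_sum += term          # adds (R^2/2)^j / j!
        term *= half / (j + 1)
    return max(0.0, 1.0 - math.exp(-half) * partial_sum)

masses = [gaussian_ball_mass(3.0, n) for n in (2, 10, 50)]
print(masses)
```

For a fixed radius the mass drops rapidly as the dimension grows, in line with the limit stated above.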


20.7  Exercises 


1.  Verify  the  identity  (20.3)  in  the  proof  of  the  Trotter  product  formula. 

2. Verify (20.5) in the proof of the Trotter product formula, using Stone’s
theorem and the following identity:

\[
\frac{1}{s} (e^{isA} e^{isB} - I)\psi = e^{isA}(iB\psi) + e^{isA} \left[ \frac{1}{s} (e^{isB} - I)\psi - iB\psi \right] + \frac{1}{s} (e^{isA} - I)\psi.
\]

3. Suppose $\{A_N\}$ is a family of bounded operators mapping a Banach
space $W_1$ to a Banach space $W_2$. Suppose that for some constant $C$,
we have $\|A_N\| \le C$ for all $N$. Finally, suppose that $\|A_N \psi\| \to 0$ as
$N \to \infty$, for every $\psi \in W_1$.


(a) Show that for each $\psi \in W_1$ and each $\varepsilon > 0$, there exists a
neighborhood $U$ of $\psi$ and an integer $M$ such that

$\|A_N \phi\| < \varepsilon$

for all $\phi \in U$ and $N \ge M$.

(b) If $K$ is a compact subset of $W_1$, show that $\|A_N \psi\|$ tends to zero
uniformly for $\psi \in K$.


4. (a) Let $H_N$ be an $N$-dimensional Hilbert space. Show that the
measure

\[
d\mu_N(x) := (2\pi)^{-N/2} e^{-\|x\|^2/2} \, dx
\]

is a probability measure. Here $dx$ is the Lebesgue measure on
$H_N$, normalized so that the unit cube has volume 1.

Hint: Use Proposition A.22.

(b) Let $B_{R,N}$ denote the ball of radius $R$ in $H_N$. Show that for each
$R < \infty$, there exists a number $a_R < 1$ such that

\[
\mu_N(B_{R,N}) \le (a_R)^N.
\]

Thus, $\lim_{N\to\infty} \mu_N(B_{R,N}) = 0$.

Hint: The ball $B_{R,N}$ is contained in a cube centered at the origin
with side length $2R$.


21 

Hamiltonian  Mechanics  on  Manifolds 


In  this  chapter,  we  generalize  the  Hamiltonian  approach  to  mechanics  (in¬ 
troduced  already  in  the  Euclidean  case  in  Sect.  2.5)  to  general  manifolds. 
The  chapter  assumes  familiarity  with  the  basic  notions  of  smooth  mani¬ 
folds,  including  tangent  and  cotangent  spaces,  vector  fields,  and  differen¬ 
tial  forms.  These  notions  are  reviewed  very  briefly  in  Sect.  21.1,  mainly  in 
the  interest  of  fixing  the  notation.  See,  for  example,  Chap.  2  of  [40]  for  a 
concise treatment of manifolds and [29] for a detailed account. Throughout
the chapter, we will use the summation convention, in which repeated indices
are always summed over.


21.1  Calculus  on  Manifolds 

Throughout  this  section,  M  will  denote  a  smooth,  n-dimensional  manifold. 


21.1.1  Tangent  Spaces,  Vector  Fields,  and  Flows 

For each $x \in M$, we have the tangent space to $M$ at $x$, denoted $T_x M$. Given
a smooth coordinate system $x_1, \ldots, x_n$ on $M$, the vectors

\[
\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n} \tag{21.1}
\]

form a basis for the tangent space at each point. A vector field $X$ on $M$
is a map assigning to each point $x \in M$ an element $X_x$ of $T_x M$. A vector
field $X$ is smooth if the coefficients of $X$ in a basis of the form (21.1) are
smooth functions, for every smooth coordinate system. As in Exercise 14
in Chap. 2, we think of a vector field as a first-order differential operator
satisfying the Leibniz rule:


X(fg)=X(f)g  +  fX(g). 


Given a smooth vector field $X$ on $M$ and a point $x \in M$, there exists a
curve $\gamma_x : (a, b) \to M$ such that $\gamma_x(0) = x$ and

\[
\frac{d\gamma_x}{dt} = X_{\gamma_x(t)}.
\]

Any two such curves agree on the intersection of their intervals of definition.
There is a largest interval $(a_x^{\max}, b_x^{\max})$ on which such a curve can be defined.
If, for each $x \in M$, we have $a_x^{\max} = -\infty$ and $b_x^{\max} = +\infty$, we say that the
vector field $X$ is complete. If $M$ is compact, then each smooth vector field
on $M$ is complete. We may assemble the curves $\gamma_x$ into the flow $\Phi$ generated
by $X$, defined as

\[
\Phi_t(x) = \gamma_x(t)
\]

whenever $a_x^{\max} < t < b_x^{\max}$. If $t$ does not belong to $(a_x^{\max}, b_x^{\max})$, then $\Phi_t(x)$
is not defined. The flow $\Phi$ satisfies

\[
\Phi_0(x) = x. \tag{21.2}
\]

Furthermore, if $x$ is in the domain of $\Phi_t$ and $\Phi_t(x)$ is in the domain of $\Phi_s$,
then $x$ is in the domain of $\Phi_{s+t}$ and

\[
\Phi_s(\Phi_t(x)) = \Phi_{s+t}(x). \tag{21.3}
\]

In the other direction, given a family of maps $\Phi$ satisfying (21.2) and
(21.3) and appropriate domain properties, there is a unique vector field $X$
such that $\Phi$ is the flow generated by $X$. In particular, if $\Phi_t(x)$ is defined
for all $x$ and $t$, is smooth as a map of $M \times \mathbb{R}$ into $M$, and satisfies (21.2)
and (21.3), there is a unique complete vector field $X$ such that $\Phi$ is the
flow generated by $X$.
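A concrete example may help fix the ideas of maximal intervals and incompleteness. The sketch below is an illustrative assumption, not an example from the text: it takes $M = \mathbb{R}$ and the vector field $X = x^2 \, d/dx$, whose integral curves are $\gamma_x(t) = x/(1 - tx)$. For $x > 0$ the curve exists only for $t < 1/x$, so $X$ is not complete.

```python
def flow(t, x):
    """Flow of X = x^2 d/dx on the real line: x / (1 - t*x) where defined.

    Returns None when t lies outside the maximal interval of the integral
    curve through x, that is, when 1 - t*x <= 0.
    """
    denominator = 1.0 - t * x
    if denominator <= 0.0:
        return None
    return x / denominator

x = 2.0
print(flow(0.0, x))                            # starts at x, as in (21.2)
print(flow(0.1, flow(0.2, x)), flow(0.3, x))   # both sides of (21.3)
print(flow(0.5, x))                            # None: the curve leaves every compact set before t = 1/2
```

Because the real line is not compact, this does not contradict the statement that smooth vector fields on compact manifolds are complete.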


21.1.2  Differential  Forms 

For each $x$, the tangent space $T_x M$ is an $n$-dimensional real vector space.
The dual vector space to $T_x M$ is the cotangent space to $M$ at $x$, denoted
$T_x^* M$. Given a smooth function $f$ on $M$ and a point $x \in M$, the differential
of $f$ at $x$ is the element of $T_x^* M$ given by

\[
df(X) = X(f)
\]

for each $X \in T_x M$. In particular, in any local coordinate system $x_1, \ldots, x_n$,
the elements $dx_1, \ldots, dx_n$ satisfy

\[
dx_j \left( \frac{\partial}{\partial x_k} \right) = \delta_{jk}.
\]

Thus, the elements $dx_1, \ldots, dx_n$ form a basis for $T_x^* M$ at each point. For
any smooth function $f$, we have

\[
df = \frac{\partial f}{\partial x_j} \, dx_j. \tag{21.4}
\]

A $k$-form $\alpha$ on $M$ is a mapping assigning to each point $x \in M$ a $k$-linear,
alternating functional $\alpha_x$ on $T_x M$. A $k$-form is smooth if $\alpha(X_1, \ldots, X_k)$ is a
smooth function on $M$ for each $k$-tuple of smooth vector fields $X_1, \ldots, X_k$
on $M$. In particular, if $f$ is a smooth function, then $df$ is a smooth 1-form.
If $\alpha$ is a smooth $k$-form and $X$ a smooth vector field, we may define the
contraction of $\alpha$ with $X$, which is the $(k - 1)$-form $i_X \alpha$ given by

\[
(i_X \alpha)(X_1, \ldots, X_{k-1}) = \alpha(X, X_1, \ldots, X_{k-1}).
\]

Given a $k$-linear form $\rho$ on a vector space $V$, define the
antisymmetrization $AS(\rho)$ of $\rho$ by

where $S_k$ denotes the permutation group on $k$ elements. Given a $k$-form $\alpha$
and an $l$-form $\beta$ on $M$, let $\alpha \otimes \beta$ be the $(k + l)$-linear form on each $T_x M$
given by

\[
(\alpha \otimes \beta)(X_1, \ldots, X_{k+l}) = \alpha(X_1, \ldots, X_k) \, \beta(X_{k+1}, \ldots, X_{k+l}).
\]

Then let $\alpha \wedge \beta$ denote the $(k + l)$-form given by

\[
\alpha \wedge \beta = AS(\alpha \otimes \beta).
\]

In particular, if $\alpha$ and $\beta$ are 1-forms, then $\alpha \wedge \beta$ is the 2-form given by

\[
(\alpha \wedge \beta)(X, Y) = \alpha(X)\beta(Y) - \alpha(Y)\beta(X).
\]

In a smooth coordinate system $x_1, \ldots, x_n$, a smooth $k$-form $\alpha$ can be
expressed uniquely as

\[
\alpha = \sum_{j_1 < \cdots < j_k} a_{j_1, \ldots, j_k}(x) \, dx_{j_1} \wedge \cdots \wedge dx_{j_k}.
\]

A 2-form $\omega$ on $M$ is said to be nondegenerate if $\omega$ defines a nondegenerate
bilinear form on each $T_x M$. More explicitly, this means that for each $x \in M$
and each nonzero $X \in T_x M$, there exists a $Y \in T_x M$ such that

\[
\omega(X, Y) \ne 0.
\]

Suppose $\alpha$ is a smooth $k$-form on $M$ and $S$ is a compact, oriented,
$k$-dimensional submanifold-with-boundary of $M$. Then one can define the
integral of $\alpha$ over $S$. There is a map $d$, called the exterior derivative,
mapping smooth $k$-forms to smooth $(k + 1)$-forms and having the property
that

\[
\int_S d\beta = \int_{\partial S} \beta \tag{21.5}
\]

for every compact, oriented, $k$-dimensional submanifold-with-boundary $S$
of $M$ and every $(k - 1)$-form $\beta$ on $M$. Here $\partial S$ is the boundary of $S$, with the
natural orientation induced by the orientation on $S$. The relation (21.5) is
known as Stokes’s theorem. A $k$-form $\alpha$ is said to be closed if $d\alpha = 0$.

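The exterior derivative may be computed in coordinates by the formula

\[
d(f \, dx_{j_1} \wedge \cdots \wedge dx_{j_k}) = \frac{\partial f}{\partial x_l} \, dx_l \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_k}.
\]

A coordinate-invariant formula for the exterior derivative of a $k$-form $\alpha$ is:

\[
d\alpha(X_1, \ldots, X_{k+1}) = \sum_{j=1}^{k+1} (-1)^{j+1} X_j \left( \alpha(X_1, \ldots, \hat{X}_j, \ldots, X_{k+1}) \right)
+ \sum_{j<l} (-1)^{j+l} \alpha([X_j, X_l], X_1, \ldots, \hat{X}_j, \ldots, \hat{X}_l, \ldots, X_{k+1}),
\]

where $\hat{X}_j$ indicates that the $X_j$ term is omitted and where $[X_j, X_l]$ is the
commutator of $X_j$ and $X_l$ as first-order differential operators. In particular,
if $\alpha$ is a 1-form, we have

\[
(d\alpha)(X, Y) = X(\alpha(Y)) - Y(\alpha(X)) - \alpha([X, Y]).
\]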


A key identity satisfied by the exterior derivative is

\[
d(d\alpha) = 0 \tag{21.6}
\]

for all $k$-forms $\alpha$. Conversely, if $\beta$ is a closed $(k + 1)$-form (i.e., $d\beta = 0$), then
$\beta$ can be expressed locally in the form $\beta = d\alpha$ for some $k$-form $\alpha$. More
precisely, if $\beta$ is closed, then for any $x \in M$ there exists a neighborhood $U$ of
$x$ and a $k$-form $\alpha$ defined on $U$ such that $\beta = d\alpha$ on $U$. If $M$ satisfies certain
topological conditions, then each closed $k$-form $\alpha$ on $M$ can be expressed
globally in the form $\alpha = d\beta$. In particular, if $M$ is simply connected, then
each closed 1-form $\beta$ can be expressed globally in the form $\beta = df$ for some
smooth function (i.e., 0-form) $f$.

If $X$ is a vector field and $\alpha$ is a $k$-form, we may define the Lie derivative
of $\alpha$ in the direction of $X$, denoted $\mathcal{L}_X \alpha$, as follows:

\[
\mathcal{L}_X \alpha = \left. \frac{d}{dt} (\Phi_t)^*(\alpha) \right|_{t=0},
\]

where $\Phi$ is the flow generated by $X$ and $(\Phi_t)^*(\alpha)$ is the pullback of $\alpha$ by
$\Phi_t$. The Lie derivative may be computed using the formula

\[
\mathcal{L}_X = i_X \circ d + d \circ i_X. \tag{21.7}
\]
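As a quick consistency check on (21.7), here is a short worked computation, assuming the formula reads $\mathcal{L}_X = i_X \circ d + d \circ i_X$ and using the convention that the contraction of a 0-form is zero. It recovers the action of $X$ on functions and shows that the Lie derivative commutes with $d$, using only $d \circ d = 0$ from (21.6).

```latex
% Assumes (21.7): \mathcal{L}_X = i_X \circ d + d \circ i_X, with i_X f = 0 for a 0-form f.
\begin{align*}
\mathcal{L}_X f &= i_X(df) + d(i_X f) = df(X) = X(f), \\
\mathcal{L}_X(d\alpha) &= i_X(d\,d\alpha) + d(i_X\, d\alpha) = d(i_X\, d\alpha), \\
d(\mathcal{L}_X \alpha) &= d(i_X\, d\alpha) + d\,d(i_X \alpha) = d(i_X\, d\alpha).
\end{align*}
```

The last two lines agree, so $\mathcal{L}_X \circ d = d \circ \mathcal{L}_X$ on forms of every degree.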


21.2  Mechanics  on  Symplectic  Manifolds 

The  reader  is  warned  that  sign  conventions  in  the  subject  of  Hamiltonian 
mechanics  are  not  consistent  from  author  to  author. 

21.2.1  Symplectic  Manifolds 

A  symplectic  manifold  is,  roughly,  a  manifold  with  enough  additional  struc¬ 
ture  to  allow  one  to  define  the  Poisson  bracket  of  two  functions. 

Definition 21.1 A symplectic manifold is a smooth manifold $N$
together with a closed, nondegenerate 2-form $\omega$ on $N$. If $(N_1, \omega_1)$ and $(N_2, \omega_2)$
are symplectic manifolds, a map $\Phi : N_1 \to N_2$ is a symplectomorphism
if $\Phi$ is a diffeomorphism and in addition

\[
\Phi^*(\omega_2) = \omega_1.
\]

It is not hard to see that every symplectic manifold must be even
dimensional, for the simple reason that an odd-dimensional vector space does not
admit a nondegenerate, skew-symmetric bilinear form.

Throughout this chapter, $N$ will always denote a symplectic manifold of
dimension $2n$ with symplectic form $\omega$.

We now show that the cotangent bundle of any manifold has the
structure of a symplectic manifold in a canonical way. Suppose $x_1, \ldots, x_n$ is
a coordinate system defined on an open set $U \subset M$. Then at each point
$x \in U$, an element $\phi$ of $T_x^* M$ can be expressed uniquely in the form

\[
\phi = p_j \, dx_j
\]

for a sequence $p_1, \ldots, p_n$ of real numbers. The quantities $x_1, \ldots, x_n$ and
$p_1, \ldots, p_n$ constitute a coordinate system on $\pi^{-1}(U)$. We refer to a
coordinate system of this sort as a standard coordinate system on $T^* M$.

Example 21.2 For any smooth manifold $M$, define a 1-form $\theta$ on the
cotangent bundle $T^* M$ by

\[
\theta_{(x,\phi)}(X) = \phi(\pi_*(X))
\]

for each tangent vector $X \in T_{(x,\phi)}(T^* M)$, where $\pi : T^* M \to M$ is the
canonical projection. Then the 2-form $\omega := d\theta$ is closed and nondegenerate.
We refer to $\theta$ and $\omega$ as the canonical 1-form and the canonical 2-form on
$T^* M$, respectively.

Proof. Using a coordinate system $\{x_j\}$ on $M$ and the associated
standard coordinate system $\{x_j, p_j\}$ on $T^* M$, the projection $\pi$ is given by
$\pi(x, p) = x$. Meanwhile, a tangent vector $X$ to $T^* M$ is expressible as a
linear combination of the $\partial/\partial x_j$’s and $\partial/\partial p_j$’s. Thus,

\[
\theta \left( a_j \frac{\partial}{\partial x_j} + b_j \frac{\partial}{\partial p_j} \right) = (p_k \, dx_k) \left( a_j \frac{\partial}{\partial x_j} \right) = p_j a_j.
\]

What this means is that

\[
\theta = p_j \, dx_j,
\]

where the $x_j$’s are now viewed as functions on $T^* M$ rather than on $M$. We
have, then,

\[
\omega = d\theta = dp_j \wedge dx_j.
\]

It is now easy to see that $\omega$ is nondegenerate (Exercise 1). ■

21.2.2  Poisson  Brackets  and  Hamiltonian  Vector  Fields 

If $\omega$ is nondegenerate, then it gives a canonical identification of $T_z N$ with
$T_z^* N$ at each point, by identifying a vector $X$ in $T_z N$ with the linear
functional $\omega(X, \cdot)$ in $T_z^* N$. We can then transfer the bilinear form $\omega$ from $T_z N$
to $T_z^* N$ by means of this identification. We denote the resulting bilinear
form on $T_z^* N$ by $\omega^{-1}$.

Definition 21.3 If $f$ and $g$ are smooth functions on $N$, define the
Poisson bracket $\{f, g\}$ of $f$ and $g$ by

\[
\{f, g\} = -\omega^{-1}(df, dg).
\]

In particular, if 1 denotes the constant function on $N$, then $\{1, f\} =
\{f, 1\} = 0$ for all smooth functions $f$.

Example 21.4 If $\omega$ is the canonical 2-form on $T^* M$, then the associated
Poisson bracket may be computed in standard coordinates as

\[
\{f, g\} = \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j} \frac{\partial g}{\partial x_j}
\]

for all smooth functions $f$ and $g$ on $T^* M$.

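Proof. The linear functional

\[
\omega \left( \frac{\partial}{\partial x_j}, \cdot \right)
\]

has a value of $-1$ on the vector $\partial/\partial p_j$ and a value of 0 on all the other
basic partial derivatives. This means that $\omega(\partial/\partial x_j, \cdot) = -dp_j$. Similarly,
$\omega(\partial/\partial p_j, \cdot) = dx_j$. We may thus compute, for example, that

\[
-1 = \omega \left( \frac{\partial}{\partial x_j}, \frac{\partial}{\partial p_j} \right) = \omega^{-1}(-dp_j, dx_j) = \omega^{-1}(dx_j, dp_j).
\]

Meanwhile, $\omega^{-1}(dx_j, dx_k) = \omega^{-1}(dp_j, dp_k) = 0$ and $\omega^{-1}(dp_j, dx_k) = 0$
when $j \ne k$. Thus, we compute that

\[
\{f, g\} = -\omega^{-1} \left( \frac{\partial f}{\partial x_j} dx_j + \frac{\partial f}{\partial p_j} dp_j, \; \frac{\partial g}{\partial x_k} dx_k + \frac{\partial g}{\partial p_k} dp_k \right)
= \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial p_k} \delta_{jk} - \frac{\partial f}{\partial p_j} \frac{\partial g}{\partial x_k} \delta_{jk},
\]

which reduces to the claimed expression. ■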

Proposition 21.5 For any smooth functions $f$, $g$, $h$ on $N$, we have

\[ \{g,f\} = -\{f,g\} \]

and

\[ \{f, gh\} = \{f,g\}h + g\{f,h\}. \]

Proof. Since $\omega$ is skew-symmetric on the tangent space to $N$ at each point
and $\omega^{-1}$ is obtained from $\omega$ by means of an isomorphism of tangent and
cotangent space, $\omega^{-1}$ is a skew-symmetric form on the cotangent space. The
skew-symmetry of the Poisson bracket follows. The second relation follows
from the Leibniz product rule for $d(gh)$ together with the bilinearity of
$\omega^{-1}$. ■
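The coordinate formula of Example 21.4 and the two identities of Proposition 21.5 can be checked mechanically on polynomials. The following is a minimal sketch, assuming $N = \mathbb{R}^2$ with coordinates $(x, p)$ and the canonical 2-form; the dictionary representation of polynomials and all function names are illustrative choices, not notation from the text.

```python
from collections import defaultdict

# A polynomial in (x, p) is a dict {(i, j): c}, meaning the sum of c * x**i * p**j.

def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}

def add(a, b, sign=1):
    """Return a + sign * b."""
    out = defaultdict(int, a)
    for k, c in b.items():
        out[k] += sign * c
    return clean(out)

def mul(a, b):
    """Return the product a * b."""
    out = defaultdict(int)
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def d_dx(a):
    """Partial derivative with respect to x."""
    return {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}

def d_dp(a):
    """Partial derivative with respect to p."""
    return {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}

def bracket(f, g):
    """{f, g} = f_x g_p - f_p g_x, the formula of Example 21.4 with n = 1."""
    return add(mul(d_dx(f), d_dp(g)), mul(d_dp(f), d_dx(g)), sign=-1)

X = {(1, 0): 1}    # the coordinate function x
P = {(0, 1): 1}    # the coordinate function p
ONE = {(0, 0): 1}  # the constant function 1

print(bracket(X, P))  # {(0, 0): 1}, that is, {x, p} = 1
```

Because the arithmetic is exact integer arithmetic, skew-symmetry and the product rule can be tested with `==` on any sample polynomials rather than with a numerical tolerance.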


Definition 21.6 If $f$ is a smooth function on $N$, let $X_f$ be the unique
vector field on $N$ such that

\[ df = \omega(X_f, \cdot). \tag{21.8} \]

We call $X_f$ the Hamiltonian vector field associated to $f$.

That is to say, $X_f$ corresponds to $df$ under the isomorphism between
tangent and cotangent spaces established by $\omega$.

Proposition 21.7 For all $f$ and $g$,

\[ X_f(g) = \{f,g\} = -X_g(f). \]

Furthermore,

\[ \omega(X_f, X_g) = -\{f,g\}. \]

Proof. For each $z \in N$, we are using $\omega$ to identify $T_zN$ with $T_z^*N$.
Equation (21.8) says that under this identification, $X_f$ is identified with $df$.
Thus,

\[ -\omega^{-1}(df, dg) = -\omega(X_f, X_g) = -df(X_g) = -X_g(f). \]

Thus, $\{f,g\} = -X_g(f)$, as claimed. A similar argument with the roles of
$f$ and $g$ reversed gives the claimed relationship between $X_f(g)$ and $\{g,f\}$.

Finally,

\[ \omega(X_f, X_g) = df(X_g) = X_g(f) = -\{f,g\}, \]

as claimed. ■

Definition 21.8 For any smooth function $f$ on $N$, the Hamiltonian
flow generated by $f$, denoted $\Phi^f$, is the flow generated by the vector field
$-X_f$.

In the case $N = T^*\mathbb{R}^n = \mathbb{R}^{2n}$, this definition agrees with our notation in
Sect. 2.5.


Proposition 21.9 For any smooth function $f$ on $N$, the Hamiltonian flow
$\Phi^f$ preserves $\omega$.

Proof. In general, a flow $\Phi$ preserves a differential form $\alpha$ if and only if
the Lie derivative $\mathcal{L}_X\alpha = 0$, where $X$ is the vector field generating $\Phi$. In
our case, since $\omega$ is closed, we have, by (21.7),

\[ \mathcal{L}_{X_f}\omega = d[i_{X_f}\omega] = d^2 f = 0, \]

since $i_{X_f}\omega$ is, by the definition of $X_f$, equal to $df$. ■

Proposition 21.10 For any smooth functions $f$, $g$, $h$ on $N$, the Jacobi
identity holds:

\[ \{f,\{g,h\}\} + \{g,\{h,f\}\} + \{h,\{f,g\}\} = 0. \]

This result shows that the space of smooth functions on $N$ forms a Lie
algebra under the Poisson bracket. The proof of Proposition 21.10 relies on
Proposition 21.9, which in turn relies on the fact that $\omega$ is closed.

Proof. Since the Hamiltonian flow $\Phi^f$ preserves $\omega$, it also preserves $\omega^{-1}$
and thus

\[ \omega^{-1}(d(g \circ \Phi_t^f), d(h \circ \Phi_t^f)) = \omega^{-1}(dg, dh) \circ \Phi_t^f, \]

or, equivalently,

\[ \{g \circ \Phi_t^f, h \circ \Phi_t^f\} = \{g,h\} \circ \Phi_t^f. \]

Differentiating this relation with respect to $t$ at $t = 0$ gives

\[ \{-X_f(g), h\} + \{g, -X_f(h)\} = -X_f(\{g,h\}), \]

or, equivalently,

\[ -\{\{f,g\},h\} - \{g,\{f,h\}\} = -\{f,\{g,h\}\}. \]

After moving $-\{f,\{g,h\}\}$ to the left-hand side of the equation and using
the skew-symmetry of the Poisson bracket, we obtain the Jacobi identity. ■
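As a quick sanity check, the Jacobi identity can be verified by hand on $\mathbb{R}^2$ with the bracket $\{f,g\} = \partial_x f\,\partial_p g - \partial_p f\,\partial_x g$ of Example 21.4. The test functions $f = x^2$, $g = p^2$, $h = xp$ are chosen here only for illustration:

```latex
\begin{align*}
\{g,h\} &= \{p^2, xp\} = -2p^2, & \{f,\{g,h\}\} &= \{x^2, -2p^2\} = -8xp,\\
\{h,f\} &= \{xp, x^2\} = -2x^2, & \{g,\{h,f\}\} &= \{p^2, -2x^2\} = 8xp,\\
\{f,g\} &= \{x^2, p^2\} = 4xp,  & \{h,\{f,g\}\} &= \{xp, 4xp\} = 0.
\end{align*}
```

The three double brackets sum to $-8xp + 8xp + 0 = 0$, as the proposition requires.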


Proposition 21.11 For any smooth functions $f$ and $g$ on $N$, the Hamiltonian
vector fields $X_f$ and $X_g$ satisfy

\[ [X_f, X_g] = X_{\{f,g\}}. \]

Proof. See Exercise 3. ■

21.2.3  Hamiltonian  Flows  and  Conserved  Quantities 

We have seen (Proposition 21.9) that if $f$ is a smooth function, then the
flow generated by $X_f$ preserves $\omega$. We have the following partial converse
to this result.

Proposition 21.12 Suppose $\Phi$ is the flow generated by a vector field $-X$
on $N$. If $\Phi$ preserves $\omega$, then $X$ can be represented locally in the form $X =
X_f$ for some smooth function $f$ on $N$. If $N$ is simply connected, the function
$f$ exists globally on $N$.

Proof. The statement that $\Phi$ preserves $\omega$ can be expressed infinitesimally
as

\[ \mathcal{L}_X\omega = 0. \]

Since also $\omega$ is closed, (21.7) tells us that

\[ d(i_X\omega) = 0. \]

Since $i_X\omega$ is closed, this 1-form can be expressed locally as $i_X\omega = df$ for
some smooth function $f$, which says precisely that $X = X_f$. If $N$ is simply
connected, then every closed 1-form can be expressed globally as $df$, for
some smooth function $f$. ■

A flow of the sort in Proposition 21.12 is said to be locally Hamiltonian.
Such a flow is said to be (globally) Hamiltonian if the function $f$ in the
proposition can be defined on all of $N$. (Compare Definition 21.8.) If $\Phi$ is a
Hamiltonian flow, the function $f$ such that $\Phi = \Phi^f$ is called a Hamiltonian
generator of $\Phi$. If $N$ is connected, then any two Hamiltonian generators of
$\Phi$ must differ by a constant.

To see that, in general, $f$ is only defined locally, consider the symplectic
manifold $S^1 \times \mathbb{R}$, with symplectic form $\omega = d\phi \wedge dx$, where $\phi$ is the angular
coordinate on $S^1$ and $x$ is the linear coordinate on $\mathbb{R}$. Note that the 1-form
$d\phi$ is independent of the choice of a local angle variable on $S^1$, since any two
such angle functions differ by a constant (an integer multiple of $2\pi$). Thus,
$d\phi$ is a globally defined, smooth 1-form, even though there is no globally
defined, smooth angle function $\phi$. Define a flow $\Phi$ by

\[ \Phi_t(\phi, x) = (\phi, x + t). \]

This flow certainly preserves $\omega$, since $dx$ is invariant under translations.
The flow $\Phi$ is generated by the vector field $-X = \partial/\partial x$, and

\[ \omega(-\partial/\partial x, \cdot) = d\phi. \]

As we have already noted, however, there is no globally defined function $\phi$
whose differential is $d\phi$.

Although  any  smooth  function  on  a  symplectic  manifold  N  generates  a 
Hamiltonian  flow,  in  physical  examples  there  is  usually  one  distinguished 
function  with  a  Hamiltonian  flow  that  is  thought  of  as  “the”  time-evolution 
of  the  system. 


Definition 21.13 A Hamiltonian system is a symplectic manifold $N$
together with a distinguished Hamiltonian flow $\Phi^H$, generated by a smooth
function $H$ on $N$, called the Hamiltonian of the system. A function
$f$ is called a conserved quantity for a Hamiltonian system $(N, \Phi^H)$ if
$f(\Phi_t^H(x))$ is independent of $t$ for each fixed $x \in N$.

As in the $\mathbb{R}^{2n}$ case, conserved quantities are useful in understanding the
nature of the dynamics. See the discussion following Corollary 2.26.

Proposition 21.14 For any Hamiltonian system $(N, \Phi^H)$, we have

\[ \frac{d}{dt} f(\Phi_t^H(z)) = \{f,H\}(\Phi_t^H(z)) \]

for all $z \in N$, or, more concisely,

\[ \frac{df}{dt} = \{f,H\}. \]

In particular, a smooth function $f$ on $N$ is a conserved quantity for a
Hamiltonian system if and only if $\{f,H\} = 0$.

Proof. For the flow $\Phi$ generated by any vector field $X$, we have

\[ \frac{d}{dt} f(\Phi_t(z)) = X_{\Phi_t(z)} f. \]

If $X = -X_H$, then by Proposition 21.7 we have $Xf = -X_H(f) = \{f,H\}$, which
is the claimed result. ■

Proposition 21.15 A smooth function $f$ is a conserved quantity for a
Hamiltonian system $(N, \Phi^H)$ if and only if $H$ is invariant under the Hamiltonian
flow $\Phi^f$ generated by $f$.

Proof. By the previous proposition, $H$ is invariant under the flow generated
by $f$ if and only if $\{H,f\} = 0$, which holds if and only if $\{f,H\} = 0$, which
holds if and only if $f$ is a conserved quantity. ■
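Proposition 21.14 can be illustrated numerically for the harmonic oscillator on $\mathbb{R}^2$, whose Hamiltonian flow is known in closed form. The sketch below is only a finite-difference check under stated assumptions: the values of `M` and `OMEGA`, the sample observable $f(x,p) = xp$, and all function names are hypothetical choices made for the example.

```python
import math

M, OMEGA = 2.0, 1.5  # hypothetical mass and frequency

def hamiltonian(x, p):
    """H(x, p) = (p^2 + (m*omega*x)^2) / (2m)."""
    return (p * p + (M * OMEGA * x) ** 2) / (2.0 * M)

def flow(t, x, p):
    """Exact solution of dx/dt = dH/dp, dp/dt = -dH/dx, i.e. the flow of -X_H."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (x * c + p * s / (M * OMEGA), p * c - M * OMEGA * x * s)

def poisson(f, g, x, p, h=1e-5):
    """{f, g} = f_x g_p - f_p g_x, approximated by central differences."""
    fx = (f(x + h, p) - f(x - h, p)) / (2 * h)
    fp = (f(x, p + h) - f(x, p - h)) / (2 * h)
    gx = (g(x + h, p) - g(x - h, p)) / (2 * h)
    gp = (g(x, p + h) - g(x, p - h)) / (2 * h)
    return fx * gp - fp * gx

def time_derivative(f, x, p, dt=1e-5):
    """d/dt of f along the flow at t = 0, approximated by central differences."""
    return (f(*flow(dt, x, p)) - f(*flow(-dt, x, p))) / (2 * dt)

def observable(x, p):
    """Sample observable f(x, p) = x * p, which is not conserved."""
    return x * p

x0, p0 = 0.7, -0.4
# df/dt and {f, H} should agree up to discretization error.
print(time_derivative(observable, x0, p0), poisson(observable, hamiltonian, x0, p0))
# H itself is conserved, since {H, H} = 0.
print(hamiltonian(*flow(3.0, x0, p0)) - hamiltonian(x0, p0))
```

The first printed pair should agree to within the finite-difference error, and the last printed value should be zero up to rounding.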

21.2.4  The  Liouville  Form 

A  symplectic  manifold  N  has  a  natural  volume  form,  which  allows  us  to 
formulate  an  analog  on  N  of  Liouville’s  theorem  (Theorem  2.27). 

Definition 21.16 If $N$ is a $2n$-dimensional symplectic manifold, the
Liouville form on $N$ is the $2n$-form $\lambda$ given by

\[ \lambda = \frac{1}{n!}\,\omega^n, \]

where $\omega^n = \omega \wedge \cdots \wedge \omega$.

Since $\omega$ is, by assumption, a nondegenerate form on each tangent space
$T_zN$, it is not hard to check that $\lambda$ is a nonvanishing $(2n)$-linear form on
each $T_zN$. Thus, $\lambda$ determines an orientation on $N$. Given a compactly
supported continuous function $f$ on $N$, we can define the integral of $f\lambda$
over $N$, computed with respect to the orientation determined by $\lambda$ itself.
Using the version of the Riesz representation theorem for locally compact
topological spaces, one can show that there is a unique measure, called
the Liouville volume measure, for which the integral of every continuous
compactly supported function $f$ is given by $\int_N f\lambda$.

We  are  now  ready  to  state  the  general  form  of  Liouville’s  theorem. 

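Theorem 21.17 (Liouville’s Theorem) For any smooth function $f$ on
$N$, the Hamiltonian flow $\Phi^f$ preserves $\lambda$.

Proof. The flow $\Phi^f$ will preserve $\lambda$ if and only if the vector field $X_f$ satisfies
$\mathcal{L}_{X_f}\lambda = 0$. But

\[ \mathcal{L}_{X_f}\lambda = \frac{1}{n!}\Big[(\mathcal{L}_{X_f}\omega) \wedge \omega \wedge \cdots \wedge \omega + \omega \wedge (\mathcal{L}_{X_f}\omega) \wedge \omega \wedge \cdots \wedge \omega + \cdots + \omega \wedge \cdots \wedge \omega \wedge (\mathcal{L}_{X_f}\omega)\Big]. \]

Since we have already shown (Proposition 21.9) that $\mathcal{L}_{X_f}\omega = 0$, we see
that $\mathcal{L}_{X_f}\lambda = 0$. ■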
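On $\mathbb{R}^2$ the Liouville form is just $\omega = dp \wedge dx$, so Liouville's theorem says that a Hamiltonian flow has Jacobian determinant 1. The sketch below checks this by finite differences for a map built from two exactly solvable Hamiltonian flows. The choice of the Hamiltonians $-\cos x$ and $p^2/2$, the step size, and the function names are assumptions made purely for illustration.

```python
import math

def kick_drift(x, p, h=0.1):
    """Compose two exact time-h Hamiltonian flows on R^2:
    first the flow of H1 = -cos(x), which changes only p (dp/dt = -sin x),
    then the flow of H2 = p**2 / 2, which changes only x (dx/dt = p)."""
    p_new = p - h * math.sin(x)
    x_new = x + h * p_new
    return x_new, p_new

def jacobian_det(phi, x, p, eps=1e-6):
    """Determinant of the Jacobian of phi at (x, p), by central differences."""
    xa, pa = phi(x + eps, p)
    xb, pb = phi(x - eps, p)
    xc, pc = phi(x, p + eps)
    xd, pd = phi(x, p - eps)
    dxdx, dpdx = (xa - xb) / (2 * eps), (pa - pb) / (2 * eps)
    dxdp, dpdp = (xc - xd) / (2 * eps), (pc - pd) / (2 * eps)
    return dxdx * dpdp - dxdp * dpdx

# Each factor preserves the area form, so the composite map should as well.
print(jacobian_det(kick_drift, 0.3, -1.2))
```

The printed determinant should equal 1 up to finite-difference error. An exact computation gives $(1 - h^2\cos x)\cdot 1 - h\cdot(-h\cos x) = 1$.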


21.3  Exercises 

1. Show that the canonical 2-form $\omega$ on $T^*M$ is nondegenerate.
Hint: Work in standard coordinates $\{x_j, p_j\}$.

2. Show that if $\Phi : M \to M$ is a diffeomorphism, then the induced map
$\Phi^* : T^*M \to T^*M$ is a symplectomorphism.

3. Using Proposition 21.7 and the Jacobi identity for the Poisson bracket,
verify that

\[ [X_f, X_g] = X_{\{f,g\}} \]

for all smooth functions $f$ and $g$ on $N$.

4. If $N$ is compact, show that

\[ \int_N \{f,g\}\,\lambda = 0 \]

for all smooth functions $f$ and $g$ on $N$.

Hint :  Apply  Liouville’s  theorem  to  the  flow 


22 

Geometric  Quantization  on  Euclidean 
Space 


22.1  Introduction 

In this chapter, we consider the geometric quantization program in the
setting of the symplectic manifold $\mathbb{R}^{2n}$, with the canonical 2-form
$\omega = dp_j \wedge dx_j$. We begin with the “prequantum” Hilbert space $L^2(\mathbb{R}^{2n})$ and
define “prequantum” operators $Q_{\mathrm{pre}}(f)$. These operators satisfy

\[ Q_{\mathrm{pre}}(\{f,g\}) = \frac{1}{i\hbar}[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] \]

for all $f$ and $g$. Nevertheless, there are several undesirable aspects to the
prequantization map that make it physically unreasonable to interpret it
as “quantization.” To obtain the quantum Hilbert space, we reduce the
number of variables from $2n$ to $n$. Depending on how we do this reduction,
we will obtain either the position Hilbert space, the momentum Hilbert
space, or the Segal-Bargmann space. Each of these subspaces is preserved
by the prequantized position and momentum operators, and by certain
other operators of the form $Q_{\mathrm{pre}}(f)$.

Although  the  material  in  this  chapter  is  a  special  case  of  what  we  do  in 
Chap.  23,  doing  this  case  first  allows  us  to  get  a  feeling  for  the  methods  and 
results  of  geometric  quantization  quickly,  without  needing  to  develop  the 
full  machinery  of  line  bundles,  connections,  and  polarizations  over  general 
symplectic  manifolds.  In  any  case,  we  would  need  to  carry  out  most  of  the 
calculations  in  this  chapter  eventually,  as  standard  examples  of  the  general 
theory.

Although this chapter does not require the full machinery of symplectic
manifolds, we will make use of the notions of 1-forms and 2-forms on $\mathbb{R}^{2n}$,
along with the notion of the differential of a 1-form. In particular, the
expression (21.6) for the differential of a 1-form will be used.

The reader should be warned that sign conventions in geometric quantization
are not consistent from author to author. The sign conventions
used here are chosen to maintain consistency with the physics literature.
In particular, we could eliminate an annoying minus sign in the definition
of the holomorphic subspace if we were willing to allow the function $p_j$ to
quantize to $i\hbar\,\partial/\partial x_j$. Since, however, the convention $P_j = -i\hbar\,\partial/\partial x_j$ is
universal in the physics literature, we have chosen to be consistent with
that convention and to accept some slightly inconvenient sign choices elsewhere.
We continue to follow the summation convention, in which repeated
indices are always summed on.


22.2  Prequantization 

Ideally, a quantization procedure $Q$, mapping functions on a symplectic
manifold $N$ to operators on some Hilbert space $\mathbf{H}$, should satisfy the
following properties. First, $Q(f)$ should be self-adjoint whenever $f$ is real
valued. Second, we should have $Q(1) = I$, where 1 is the constant function.
Third, $Q(\{f,g\})$ should be equal to $[Q(f), Q(g)]/(i\hbar)$. Fourth, there should
be some sort of “smallness” assumption. In the case $N = \mathbb{R}^{2n}$, for example,
we may require that $\mathbf{H}$ should be irreducible under the action of the
(exponentiated) position and momentum operators. (See Definition 14.6.)
Although Groenewold’s theorem (Theorem 13.13) suggests that it is unrealistic
to expect to find a quantization procedure that satisfies all of these
properties exactly, we try to come as close as possible.

Throughout this chapter, we follow the convention of thinking of a “vector
field” on $\mathbb{R}^n$ as a first-order differential operator, as in Exercise 14 in
Chap. 2. Given, for example, the vector-valued function

\[ X = (2x_1 + x_2,\ x_1 x_2) \]

on $\mathbb{R}^2$, we identify $X$ with the operator of “differentiation in the direction
of $X$,” that is, with the following first-order differential operator:

\[ X = (2x_1 + x_2)\frac{\partial}{\partial x_1} + x_1 x_2\frac{\partial}{\partial x_2}. \]

In particular, given a smooth function $f$ on $\mathbb{R}^{2n}$, the Hamiltonian vector
field $X_f$ associated to $f$ is thought of as a differential operator:

\[ X_f = \{f, \cdot\} = \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}, \tag{22.1} \]

acting on $C^\infty(\mathbb{R}^{2n})$. (Compare Proposition 21.7.) By Proposition 21.11, the
commutator (as differential operators) of two Hamiltonian vector fields $X_f$
and $X_g$ is $X_{\{f,g\}}$. Thus, the operators $i\hbar X_f$ satisfy the desired commutation
relations:

\[ [i\hbar X_f, i\hbar X_g] = (i\hbar)^2 X_{\{f,g\}} = (i\hbar)(i\hbar X_{\{f,g\}}). \]
It is tempting, then, to define a (pre)quantization map simply by taking
$Q(f) = i\hbar X_f$, viewed as a self-adjoint operator on the Hilbert space
$L^2(\mathbb{R}^{2n})$. This map, however, does not satisfy $Q(1) = I$. If we try to correct
our definition to $Q(f) = i\hbar X_f + f$, where $f$ means the operator of multiplication
by $f$, then $Q(1) = I$ but the desired commutation property is
destroyed.

It is possible to achieve both $Q(1) = I$ and the desired commutation
relations by adding one more term as follows. If $\omega = dp_j \wedge dx_j$ is the
canonical 2-form on $\mathbb{R}^{2n}$, let $\theta$ be any symplectic potential for $\omega$, that is,
any one-form with

\[ d\theta = \omega. \tag{22.2} \]

(We may, e.g., take $\theta = p_j\,dx_j$.) For a smooth function $f$ on $\mathbb{R}^{2n}$, define an
operator $Q_{\mathrm{pre}}(f)$, acting on $C^\infty(\mathbb{R}^{2n})$, by

\[ Q_{\mathrm{pre}}(f) = i\hbar\left(X_f - \frac{i}{\hbar}\theta(X_f)\right) + f. \tag{22.3} \]

The expression $f$ on the right-hand side of (22.3) means, more precisely,
the operator of multiplication by $f$, and similarly for the function $\theta(X_f)$.
Note that since $\theta$ is a 1-form and $X_f$ is a vector field, $\theta(X_f)$ is a function on
$\mathbb{R}^{2n}$. The operator $Q_{\mathrm{pre}}(f)$ is the prequantization of $f$ and is to be viewed
as an unbounded operator on $L^2(\mathbb{R}^{2n})$, where we refer to $L^2(\mathbb{R}^{2n})$ as the
prequantum Hilbert space.

According to Exercise 1, any divergence-free vector field on $\mathbb{R}^n$ is a
skew-symmetric operator on $C_c^\infty(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$. Meanwhile, each Hamiltonian
vector field is divergence free, as we have already remarked in the proof
of Liouville’s theorem (Theorem 2.27). Thus, for any smooth, real-valued
function $f$ on $\mathbb{R}^{2n}$, the operator $Q_{\mathrm{pre}}(f)$ is at least symmetric. It can be
shown that if $X_f$ is complete, meaning that the associated Hamiltonian flow
is defined for all times, then $Q_{\mathrm{pre}}(f)$ is actually self-adjoint on a natural
domain. (See the discussion following the proof of Proposition 23.13.)

As it turns out, the $\theta(X_f)$ term in (22.3) is precisely what is needed to
restore the desired commutation relations, while still allowing $Q_{\mathrm{pre}}(1)$ to
equal the identity.

Proposition 22.1 For all $f, g \in C^\infty(\mathbb{R}^{2n})$, we have

\[ \frac{1}{i\hbar}[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] = Q_{\mathrm{pre}}(\{f,g\}), \]

where the identity is to be understood as an equality of operators on
$C^\infty(\mathbb{R}^{2n})$.

Before proving this result, it is useful to understand the behavior of the
expression $X_f - (i/\hbar)\theta(X_f)$ occurring in the definition of $Q_{\mathrm{pre}}(f)$.

Definition 22.2 For any symplectic potential $\theta$ and vector field $X$ on $\mathbb{R}^{2n}$,
let $\nabla_X$ denote the covariant derivative operator, acting on $C^\infty(\mathbb{R}^{2n})$,
given by

\[ \nabla_X = X - \frac{i}{\hbar}\theta(X). \tag{22.4} \]

Note that our prequantized operators can be written as

\[ Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f. \]

Proposition 22.3 For any symplectic potential $\theta$, let $\nabla_X$ denote the
associated covariant derivative in (22.4). Then for all smooth vector fields
$X$ and $Y$ on $\mathbb{R}^{2n}$, we have

\[ [\nabla_X, \nabla_Y] = \nabla_{[X,Y]} - \frac{i}{\hbar}\omega(X,Y). \tag{22.5} \]

In particular, if $X = X_f$ and $Y = X_g$, we have

\[ [\nabla_{X_f}, \nabla_{X_g}] = \nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}. \]

According to standard differential geometric definitions, the 2-form $\omega/\hbar$
on the right-hand side of (22.5) is the curvature of the covariant derivative
$\nabla$. For our purposes, the fact that $[\nabla_{X_f}, \nabla_{X_g}]$ is not simply $\nabla_{X_{\{f,g\}}}$ is an
advantage. The extra term in the formula for the commutator is just what
we need to compensate for the failure of the operators $i\hbar X_f + f$ to have
the desired commutation relations.

Proof. Using the easily verified identity $[\nabla_X, f] = X(f)$, we obtain

\[ [\nabla_X, \nabla_Y] - \nabla_{[X,Y]} = -\frac{i}{\hbar}\left[X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])\right]. \]

In light of (21.6), the right-hand side becomes $-(i/\hbar)(d\theta)(X,Y)$, where
$d\theta = \omega$. ■

We  may  now  easily  prove  Proposition  22.1. 

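Proof of Proposition 22.1. Using Proposition 22.3, we obtain

\begin{align*}
\frac{1}{i\hbar}[i\hbar\nabla_{X_f} + f,\ i\hbar\nabla_{X_g} + g]
&= (i\hbar)\left(\nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}\right) + X_f(g) - X_g(f)\\
&= i\hbar\nabla_{X_{\{f,g\}}} - \{f,g\} + \{f,g\} + \{f,g\},
\end{align*}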


which reduces to what we want. ■

Example 22.4 If $\theta = p_j\,dx_j$, the prequantized position and momentum
operators are given by

\begin{align*}
Q_{\mathrm{pre}}(x_j) &= x_j + i\hbar\frac{\partial}{\partial p_j},\\
Q_{\mathrm{pre}}(p_j) &= -i\hbar\frac{\partial}{\partial x_j}.
\end{align*}

These operators are essentially self-adjoint on $C_c^\infty(\mathbb{R}^{2n})$ and their
self-adjoint extensions satisfy the exponentiated commutation relations of
Definition 14.2.

Proof. We compute that $X_{x_j} = \partial/\partial p_j$ and that $\theta(X_{x_j}) = 0$, giving the
indicated expression for $Q_{\mathrm{pre}}(x_j)$. Meanwhile, $X_{p_j} = -\partial/\partial x_j$ and
$\theta(X_{p_j}) = -p_j$. There is a cancellation of the $\theta(X_{p_j})$ term in the definition of
$Q_{\mathrm{pre}}(p_j)$ with the $p_j$ term, leaving $Q_{\mathrm{pre}}(p_j) = i\hbar X_{p_j}$.

The essential self-adjointness of the operators follows from Proposition
9.40. To verify the exponentiated commutation relations, we calculate the
associated one-parameter unitary groups as

\begin{align*}
(e^{itQ_{\mathrm{pre}}(x_j)}\psi)(\mathbf{x}, \mathbf{p}) &= e^{itx_j}\psi(\mathbf{x}, \mathbf{p} - t\hbar e_j),\\
(e^{itQ_{\mathrm{pre}}(p_j)}\psi)(\mathbf{x}, \mathbf{p}) &= \psi(\mathbf{x} + t\hbar e_j, \mathbf{p}), \tag{22.6}
\end{align*}

where we now let $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ denote the unique self-adjoint
extensions of the given operators on $C_c^\infty(\mathbb{R}^{2n})$. (Compare Proposition 13.5.)
The exponentiated commutation relations can now be easily verified by
direct calculation. ■
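Proposition 22.1 and Example 22.4 can be tested mechanically on polynomial wave functions. The following is a minimal sketch, assuming $n = 1$ and the symplectic potential $\theta = p\,dx$, so that $\theta(X_f) = -p\,\partial f/\partial p$ and $Q_{\mathrm{pre}}(f)\psi = i\hbar(f_x\psi_p - f_p\psi_x) + (f - p f_p)\psi$. The numeric value of `HBAR`, the dictionary representation of polynomials, and all function names are illustrative assumptions.

```python
from collections import defaultdict

HBAR = 2.0  # hypothetical numeric value; any nonzero real number would do

# A polynomial in (x, p) is a dict {(i, j): c}, meaning the sum of c * x**i * p**j.

def clean(a):
    """Drop zero coefficients."""
    return {k: c for k, c in a.items() if c != 0}

def add(a, b, factor=1):
    """Return a + factor * b."""
    out = defaultdict(complex, a)
    for k, c in b.items():
        out[k] += factor * c
    return clean(out)

def mul(a, b):
    """Return the product a * b."""
    out = defaultdict(complex)
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def scale(a, s):
    """Return s * a for a scalar s."""
    return clean({k: s * c for k, c in a.items()})

def d_dx(a):
    return {(i - 1, j): i * c for (i, j), c in a.items() if i > 0}

def d_dp(a):
    return {(i, j - 1): j * c for (i, j), c in a.items() if j > 0}

P = {(0, 1): 1}  # the coordinate function p

def bracket(f, g):
    """{f, g} = f_x g_p - f_p g_x."""
    return add(mul(d_dx(f), d_dp(g)), mul(d_dp(f), d_dx(g)), -1)

def q_pre(f, psi):
    """Apply Q_pre(f) = i*hbar*X_f + theta(X_f) + f to psi, with theta = p dx,
    X_f = f_x d/dp - f_p d/dx and theta(X_f) = -p * f_p."""
    x_f_psi = add(mul(d_dx(f), d_dp(psi)), mul(d_dp(f), d_dx(psi)), -1)
    potential = add(f, mul(P, d_dp(f)), -1)  # f - p * f_p
    return add(scale(x_f_psi, 1j * HBAR), mul(potential, psi))

def commutator_over_ihbar(f, g, psi):
    """Apply (1/(i*hbar)) [Q_pre(f), Q_pre(g)] to psi."""
    comm = add(q_pre(f, q_pre(g, psi)), q_pre(g, q_pre(f, psi)), -1)
    return scale(comm, 1 / (1j * HBAR))
```

For sample polynomials `f`, `g`, and `psi`, the dictionaries `commutator_over_ihbar(f, g, psi)` and `q_pre(bracket(f, g), psi)` should coincide. Dropping the `potential` correction term (that is, using only $i\hbar X_f + f$) would break this agreement, which is the point of the $\theta(X_f)$ term.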

As we have presented things so far, the concept of covariant derivative,
and thus also of prequantization, depends on the choice of symplectic potential
$\theta$. This dependence is, however, illusory; we will now show that the
prequantum maps obtained with two different symplectic potentials are
unitarily equivalent.

Proposition 22.5 Suppose that $\theta_1$ and $\theta_2$ are two different symplectic potentials
for the canonical 2-form $\omega$, so that $d(\theta_1 - \theta_2) = 0$. Let the associated
covariant derivatives be denoted by $\nabla^1$ and $\nabla^2$. Choose a real-valued function
$\gamma$ so that $d\gamma = \theta_1 - \theta_2$ and let $U_\gamma$ be the unitary map of $L^2(\mathbb{R}^{2n})$ to
itself given by

\[ U_\gamma\psi = e^{-i\gamma/\hbar}\psi. \]

Then for every vector field $X$, we have

\[ U_\gamma\nabla^1_X U_\gamma^{-1} = \nabla^2_X. \tag{22.7} \]

If $Q^j_{\mathrm{pre}}(f)$, $j = 1, 2$, are the associated prequantization maps, it follows that

\[ U_\gamma Q^1_{\mathrm{pre}}(f) U_\gamma^{-1} = Q^2_{\mathrm{pre}}(f). \tag{22.8} \]

The map $U_\gamma$ is called a gauge transformation.

Proof. The operation of multiplication by $\theta_1(X)$ commutes with
multiplication by $e^{i\gamma/\hbar}$, whereas

\[ X(e^{i\gamma/\hbar}\psi) = e^{i\gamma/\hbar}X\psi + \frac{i}{\hbar}e^{i\gamma/\hbar}X(\gamma)\psi. \]

Since $X(\gamma) = (d\gamma)(X) = \theta_1(X) - \theta_2(X)$, we obtain

\begin{align*}
\nabla^1_X(e^{i\gamma/\hbar}\psi) &= e^{i\gamma/\hbar}\left(X + \frac{i}{\hbar}X(\gamma) - \frac{i}{\hbar}\theta_1(X)\right)\psi\\
&= e^{i\gamma/\hbar}\left(X - \frac{i}{\hbar}\theta_2(X)\right)\psi\\
&= e^{i\gamma/\hbar}\nabla^2_X\psi.
\end{align*}

Multiplying both sides of this equality by $e^{-i\gamma/\hbar}$ gives (22.7). Equation
(22.8) follows by observing that multiplication by $f$ commutes with multiplication
by $e^{i\gamma/\hbar}$. ■
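A concrete instance may help. Take $\theta_1 = p_j\,dx_j$ and, as an illustrative second potential chosen here, $\theta_2 = -x_j\,dp_j$, which also satisfies $d\theta_2 = dp_j \wedge dx_j = \omega$. Then the gauge function and the transformed position operator can be worked out directly:

```latex
\begin{align*}
\theta_1 - \theta_2 &= p_j\,dx_j + x_j\,dp_j = d(\mathbf{x}\cdot\mathbf{p}),
  \qquad \gamma = \mathbf{x}\cdot\mathbf{p},\\
U_\gamma Q^1_{\mathrm{pre}}(x_j) U_\gamma^{-1}\psi
  &= e^{-i\mathbf{x}\cdot\mathbf{p}/\hbar}
     \left(x_j + i\hbar\frac{\partial}{\partial p_j}\right)
     \left(e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\psi\right)
   = x_j\psi - x_j\psi + i\hbar\frac{\partial\psi}{\partial p_j}
   = i\hbar\frac{\partial\psi}{\partial p_j}.
\end{align*}
```

This agrees with computing $Q^2_{\mathrm{pre}}(x_j)$ directly from (22.3): since $X_{x_j} = \partial/\partial p_j$ and $\theta_2(\partial/\partial p_j) = -x_j$, we get $Q^2_{\mathrm{pre}}(x_j) = i\hbar\,\partial/\partial p_j - x_j + x_j = i\hbar\,\partial/\partial p_j$, as (22.8) predicts.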


22.3  Problems  with  Prequantization 

Given the naturalness of the prequantization construction, it is tempting
to think that prequantization could actually be considered as quantization.
Why not take our Hilbert space to be $L^2(\mathbb{R}^{2n})$ and the quantized operators
to be $Q_{\mathrm{pre}}(f)$? To answer this question, we now examine some undesirable
properties of prequantization.

In the first place, the Hilbert space $L^2(\mathbb{R}^{2n})$ is very far from irreducible
under the action of the quantized position and momentum operators, in
contrast to the ordinary Schrödinger Hilbert space $L^2(\mathbb{R}^n)$, which is irreducible,
by Proposition 14.7. Indeed, in Sect. 22.4, we will construct a large
family of invariant subspaces. (See Proposition 22.13.)

In the second place, the prequantization map is very far from being multiplicative.
Of course, since quantum operators do not commute, we cannot
expect any quantization scheme $Q$ to satisfy $Q(fg) = Q(f)Q(g)$ for all $f$
and $g$. Nevertheless, the standard quantization schemes we have considered
in Chap. 13 do satisfy this relation for certain classes of observables $f$ and
$g$. In the Weyl quantization, for example, we have multiplicativity if $f$ and
$g$ are both functions of $\mathbf{x}$ only, independent of $\mathbf{p}$ (or functions of $\mathbf{p}$,
independent of $\mathbf{x}$). For the prequantization map, however, we almost never have
multiplicativity, for the simple reason that $Q_{\mathrm{pre}}(fg)$ is a first-order differential
operator, whereas $Q_{\mathrm{pre}}(f)Q_{\mathrm{pre}}(g)$ is second-order, provided there is
at least one point where $X_f$ and $X_g$ are both nonzero.

In the third place, the prequantization map badly fails to map positive
functions to positive operators. Although most of the quantization schemes
in Chap. 13 do not always map positive functions to positive operators, they
somehow come close to doing so. Indeed, $Q_{\mathrm{Weyl}}$, $Q_{\mathrm{Wick}}$, and $Q_{\mathrm{anti\text{-}Wick}}$
all map the harmonic oscillator Hamiltonian to a non-negative operator,
since $a^*a + (1/2)I$, $a^*a$, and $aa^*$ are all non-negative. (See Exercise 4 in
Chap. 13.) By contrast, the prequantized harmonic oscillator Hamiltonian
has spectrum that is unbounded below, as we now demonstrate.


Proposition 22.6 Consider a harmonic oscillator Hamiltonian of the
form

\[ H(x,p) = \frac{1}{2m}\left(p^2 + (m\omega x)^2\right). \]

Then for each integer $n$, the number $n\hbar\omega$ is an eigenvalue for $Q_{\mathrm{pre}}(H)$.

Note that $n$ in the proposition is allowed to be negative, so that the
spectrum of $Q_{\mathrm{pre}}(H)$ is not even bounded below. On the other hand, in
Sect. 22.5, we will consider a certain closed subspace $\mathbf{H}_\alpha$ of the prequantum
Hilbert space, which is one candidate for the quantum Hilbert space. For
appropriate choice of $\alpha$, the space $\mathbf{H}_\alpha$ is invariant under $Q_{\mathrm{pre}}(H)$ and the
restriction of $Q_{\mathrm{pre}}(H)$ is self-adjoint with spectrum $n\hbar\omega$, where $n$ ranges
over the non-negative integers. See Proposition 22.14. Finally, when
we introduce half-forms in Sect. 23.7, we will restore the spectrum
$(n + 1/2)\hbar\omega$, where $n$ ranges over the non-negative integers, that we found
in Chap. 11.

Proof. We can write $H$ as

\[ H(x,p) = \frac{1}{2m}(p^2 + y^2), \]

where $y = m\omega x$. The flow associated to this Hamiltonian consists of rotations
in the $(y,p)$-plane. If we choose our symplectic potential to be

\[ \theta = \frac{1}{2}(p\,dx - x\,dp) = \frac{1}{2m\omega}(p\,dy - y\,dp), \]

then the $\theta(X_H)$ term in $Q_{\mathrm{pre}}(H)$ cancels with the $H$ term, leaving

\begin{align*}
Q_{\mathrm{pre}}(H) &= i\hbar X_H\\
&= i\hbar\left(m\omega^2 x\frac{\partial}{\partial p} - \frac{p}{m}\frac{\partial}{\partial x}\right)\\
&= i\hbar\omega\left(y\frac{\partial}{\partial p} - p\frac{\partial}{\partial y}\right).
\end{align*}

Now, if $\phi$ denotes the angular variable for polar coordinates in the $(y,p)$-plane,
then $y\,\partial/\partial p - p\,\partial/\partial y$ is just $\partial/\partial\phi$. Thus, we can find eigenvectors
for $Q_{\mathrm{pre}}(H)$ of the form

\[ \psi(r,\phi) = f(r)e^{-in\phi}, \]

where $n$ is an integer and $f$ is an arbitrary function with $\int_0^\infty |f(r)|^2\,r\,dr < \infty$.

■
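The eigenvalue claim can be checked numerically, including for negative $n$. The sketch below applies $i\hbar\omega(y\,\partial/\partial p - p\,\partial/\partial y)$ by central differences to $\psi = f(r)e^{-in\phi}$. The radial profile $f(r) = e^{-r^2}$, the numeric values of `HBAR` and `OMEGA`, and the function names are assumptions chosen only for illustration.

```python
import cmath
import math

HBAR, OMEGA = 1.0, 2.0  # hypothetical numeric values

def psi(y, p, n):
    """psi(r, phi) = f(r) * exp(-i n phi), with the sample choice f(r) = exp(-r**2).
    Here (r, phi) are polar coordinates in the (y, p)-plane."""
    r, phi = math.hypot(y, p), math.atan2(p, y)
    return math.exp(-r * r) * cmath.exp(-1j * n * phi)

def q_pre_h(func, y, p, n, h=1e-5):
    """Apply i*hbar*omega*(y d/dp - p d/dy) to func, by central differences."""
    d_p = (func(y, p + h, n) - func(y, p - h, n)) / (2 * h)
    d_y = (func(y + h, p, n) - func(y - h, p, n)) / (2 * h)
    return 1j * HBAR * OMEGA * (y * d_p - p * d_y)

# Compare Q_pre(H) psi with n*hbar*omega*psi at a sample point, for a negative n.
n, y0, p0 = -3, 0.8, 0.5
print(q_pre_h(psi, y0, p0, n), n * HBAR * OMEGA * psi(y0, p0, n))
```

The two printed complex numbers should agree up to finite-difference error, which illustrates that $-3\hbar\omega$ is an eigenvalue. The sample point is kept away from the negative $y$-axis, where `atan2` has its branch cut.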

The  conclusion  of  the  matter  is  that  it  is  not  physically  reasonable  to 
use  prequantization  as  our  quantization  scheme.  Instead,  we  will  pass  to 
a  “smaller”  Hilbert  space  on  which  the  position  and  momentum  operators 
act  irreducibly. 


22.4  Quantization 

To obtain a Hilbert space that can be thought of as giving us a “quantization”
(as opposed to a prequantization) of $\mathbb{R}^{2n}$, we restrict ourselves to
a subspace of the prequantum Hilbert space. The idea is that we should
be using only half of the variables on $\mathbb{R}^{2n}$. We might, for example, restrict
ourselves to functions that depend only on the position variables and are
independent of the momentum variables. Now, the space of functions $\psi$ that
are, say, independent of $\mathbf{p}$ in the ordinary sense (i.e., $\psi(\mathbf{x}, \mathbf{p}) = \psi(\mathbf{x}, \mathbf{p}')$)
is not invariant under gauge transformations (the maps $U_\gamma$ in Proposition 22.5).
The gauge-invariant notion of being independent of $\mathbf{p}$ is that
the covariant derivatives of $\psi$ should be zero in the $\mathbf{p}$-directions. Similarly,
we may consider spaces of functions with covariant derivatives that are
zero in some other set of $n$ directions.

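Definition 22.7 Fix a symplectic potential $\theta$. Define the position subspace
as the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

\[ \nabla_{\partial/\partial p_j}\psi = 0 \]

for all $j$. Similarly, define the momentum subspace as the subspace of
$C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

\[ \nabla_{\partial/\partial x_j}\psi = 0 \]

for all $j$. Finally, define the holomorphic subspace with parameter $\alpha$ to
be the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

\[ \nabla_{\partial/\partial\bar z_j}\psi = 0 \]

for all $j$, where $z_j = x_j - i\alpha p_j$ and where $\partial/\partial z_j$ and $\partial/\partial\bar z_j$ are defined by

\begin{align*}
\frac{\partial}{\partial z_j} &= \frac{1}{2}\left(\frac{\partial}{\partial x_j} + \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right),\\
\frac{\partial}{\partial\bar z_j} &= \frac{1}{2}\left(\frac{\partial}{\partial x_j} - \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right). \tag{22.9}
\end{align*}

The operators $\partial/\partial z_j$ and $\partial/\partial\bar z_j$ are nothing but the usual complex derivative
operators on $\mathbb{C}^n$ written in terms of the variables $\mathbf{x}$ and $\mathbf{p}$, where we
identify $\mathbb{R}^{2n}$ with $\mathbb{C}^n$ by the map $(\mathbf{x}, \mathbf{p}) \mapsto \mathbf{x} - i\alpha\mathbf{p}$.

Of course, the exact form of the various subspaces in Definition 22.7
depends on the choice of symplectic potential. It is convenient to use the
symplectic potential $\theta = p_j\,dx_j$.

Proposition 22.8 Take the symplectic potential $\theta = p_j\,dx_j$. Then the
position, momentum, and holomorphic subspaces may be computed as follows.
The position subspace consists of smooth functions $\psi$ on $\mathbb{R}^{2n}$ of the
form

\[ \psi(\mathbf{x}, \mathbf{p}) = \phi(\mathbf{x}), \]

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. The momentum subspace
consists of smooth functions $\psi$ of the form

\[ \psi(\mathbf{x}, \mathbf{p}) = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p}), \tag{22.10} \]

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. Finally, the holomorphic
subspace consists of functions of the form

\[ \psi(\mathbf{x}, \mathbf{p}) = F(z_1, \ldots, z_n)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}, \tag{22.11} \]

where $F$ is an arbitrary holomorphic function on $\mathbb{C}^n$ and where $z_j = x_j -
i\alpha p_j$.

Proof. Since $\theta(\partial/\partial p_j) = 0$, we have $\nabla_{\partial/\partial p_j} = \partial/\partial p_j$, so that functions
that are covariantly constant in the $\mathbf{p}$-directions are actually constant in
the $\mathbf{p}$-directions. Meanwhile, $\theta(\partial/\partial x_j) = p_j$ and so

\[ \nabla_{\partial/\partial x_j} = \frac{\partial}{\partial x_j} - \frac{i}{\hbar}p_j. \]

Now, any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form $e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{x}, \mathbf{p})$ for
some other function $\phi$. If we use this form to compute $\nabla_{\partial/\partial x_j}\psi$, there is a
convenient cancellation, giving

\[ (\nabla_{\partial/\partial x_j}\psi)(\mathbf{x}, \mathbf{p}) = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\frac{\partial\phi}{\partial x_j}. \]

Thus, $\nabla_{\partial/\partial x_j}\psi = 0$ for all $j$ if and only if $\phi$ is independent of $\mathbf{x}$.

Finally, we note that $\theta(\partial/\partial\bar z_j) = p_j/2$, so that

\[ \nabla_{\partial/\partial\bar z_j} = \frac{\partial}{\partial\bar z_j} - \frac{i}{2\hbar}p_j. \]

Any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form $\psi(\mathbf{x}, \mathbf{p}) = e^{-\alpha|\mathbf{p}|^2/(2\hbar)}F$
for some other function $F$, where we note that

\[ e^{-\alpha|\mathbf{p}|^2/(2\hbar)} = \exp\Big\{\sum_j (z_j - \bar z_j)^2/(8\alpha\hbar)\Big\}. \]

Thus,

\[ \frac{\partial}{\partial\bar z_j}e^{-\alpha|\mathbf{p}|^2/(2\hbar)} = -\frac{z_j - \bar z_j}{4\alpha\hbar}e^{-\alpha|\mathbf{p}|^2/(2\hbar)} = \frac{i}{2\hbar}p_j e^{-\alpha|\mathbf{p}|^2/(2\hbar)}. \]

When we compute $\nabla_{\partial/\partial\bar z_j}\psi$ using the indicated form, there is another
convenient cancellation, giving

\[ (\nabla_{\partial/\partial\bar z_j}\psi)(\mathbf{x}, \mathbf{p}) = e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\frac{\partial F}{\partial\bar z_j}. \]

Thus, $\nabla_{\partial/\partial\bar z_j}\psi = 0$ for all $j$ if and only if $F$ is holomorphic as a function
of the variables $z_j = x_j - i\alpha p_j$. ■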
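The holomorphic case of Proposition 22.8 can be checked numerically for $n = 1$. The sketch below applies $\nabla_{\partial/\partial\bar z} = \tfrac{1}{2}(\partial_x - (i/\alpha)\partial_p) - (i/(2\hbar))p$ by central differences, once to a function of the form (22.11) and once to a function with a non-holomorphic prefactor. The sample functions, the numeric values of `HBAR` and `ALPHA`, and the function names are illustrative assumptions.

```python
import cmath

HBAR, ALPHA = 1.0, 0.5  # hypothetical numeric values

def gaussian(p):
    """The factor exp(-alpha p^2 / (2 hbar)) appearing in (22.11)."""
    return cmath.exp(-ALPHA * p * p / (2 * HBAR))

def psi(x, p):
    """F(z) times the Gaussian factor, with z = x - i alpha p and sample F(z) = z**3 + 2z."""
    z = complex(x, -ALPHA * p)
    return (z ** 3 + 2 * z) * gaussian(p)

def bad_psi(x, p):
    """Same Gaussian factor, but with the non-holomorphic prefactor conj(z)."""
    z = complex(x, -ALPHA * p)
    return z.conjugate() * gaussian(p)

def nabla_zbar(func, x, p, h=1e-5):
    """Apply (1/2)(d/dx - (i/alpha) d/dp) - (i/(2 hbar)) p, by central differences."""
    d_x = (func(x + h, p) - func(x - h, p)) / (2 * h)
    d_p = (func(x, p + h) - func(x, p - h)) / (2 * h)
    return 0.5 * (d_x - (1j / ALPHA) * d_p) - (1j / (2 * HBAR)) * p * func(x, p)

print(abs(nabla_zbar(psi, 0.4, -0.7)))      # should be close to 0
print(abs(nabla_zbar(bad_psi, 0.4, -0.7)))  # should be close to |gaussian(-0.7)|
```

For `bad_psi` the proof's formula gives $\nabla_{\partial/\partial\bar z}\psi = e^{-\alpha p^2/(2\hbar)}\,\partial\bar z/\partial\bar z = e^{-\alpha p^2/(2\hbar)}$, which is nonzero, so that function lies outside the holomorphic subspace.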

From the physical standpoint, we do not merely want a vector space of
functions, but a Hilbert space. It is natural, then, to look at functions of the
forms computed in Proposition 22.8 that belong to $L^2(\mathbb{R}^{2n})$. In the case of
the position and momentum subspaces, we encounter a serious problem:
There are no nonzero functions of the indicated form that are square
integrable over $\mathbb{R}^{2n}$. After all, if $\psi$ is in the position subspace, then
$\psi(\mathbf{x},\mathbf{p})$ is independent of $\mathbf{p}$ and the integral of $|\psi|^2$ over the
$\mathbf{p}$-variables will be infinite, unless $\psi$ is zero almost everywhere. If $\psi$ is
in the momentum subspace, $|\psi|^2$ is independent of $\mathbf{x}$ and we have a similar problem.

The solution to this problem is to integrate not over $\mathbb{R}^{2n}$ but over $\mathbb{R}^n$.
Although the “proper” way to make this change of integration is to
introduce the notion of “half-forms,” as in Chap. 23, we will content ourselves
in this chapter with the following simplistic rule: integrate only over the
variables on which $|\psi|^2$ depends. If we want to get a Hilbert space (not just
an inner product space), we must also allow functions of the specified form
that are square integrable but not necessarily smooth. We may therefore
identify the position Hilbert space and momentum Hilbert space as follows.

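Conclusion 22.9 The position Hilbert space is the space of functions on
$\mathbb{R}^{2n}$ of the form

\[ \psi(\mathbf{x},\mathbf{p}) = \phi(\mathbf{x}), \]

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed as

\[ \|\psi\|^2 = \int_{\mathbb{R}^n} |\phi(\mathbf{x})|^2\,d\mathbf{x}. \]

The momentum Hilbert space is the space of functions on $\mathbb{R}^{2n}$ of the form

\[ \psi(\mathbf{x},\mathbf{p}) = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p}), \]

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed as

\[ \|\psi\|^2 = \int_{\mathbb{R}^n} |\phi(\mathbf{p})|^2\,d\mathbf{p}. \]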


If we consider the holomorphic subspace, we find that it behaves better
than the position and momentum subspaces, in that there exist nonzero
functions of the form (22.11) that are square integrable over $\mathbb{R}^{2n}$, as we
will see shortly. Furthermore, the functions of the form (22.11)
that are square integrable over $\mathbb{R}^{2n}$ form a closed subspace of $L^2(\mathbb{R}^{2n})$, by
the same argument as in the proof of Proposition 14.15.

Conclusion 22.10 The holomorphic Hilbert space consists of those
functions $\psi$ of the form (22.11) that are square integrable over $\mathbb{R}^{2n}$. If $\psi$
is identified with the holomorphic function $F$ in (22.11), then this Hilbert
space may be identified with $\mathcal{H}L^2(\mathbb{C}^n,\nu)$, where

\[ \nu(z) = e^{-|\operatorname{Im} z|^2/(\alpha\hbar)}. \]

The space $\mathcal{H}L^2(\mathbb{C}^n,\nu)$ is nothing but an invariant form of the
Segal-Bargmann space (Definition 14.14), where here “invariant” means that
the density $\nu$ is invariant under translations in the real directions. This
space can be identified unitarily with the ordinary Segal-Bargmann space
$\mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar})$ as follows. Define a map
$\Psi : \mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar}) \to \mathcal{H}L^2(\mathbb{C}^n,\nu)$ by

\[ \Psi(F)(z) = (2\pi\alpha\hbar)^{-n/2}\,e^{-z^2/(4\alpha\hbar)}F(z), \tag{22.12} \]

where $z^2 = z_1^2 + \cdots + z_n^2$. Then a simple calculation shows that

\[ \|\Psi(F)\|^2_{L^2(\mathbb{C}^n,\nu)} = \int_{\mathbb{C}^n} |F(z)|^2\,\mu_{2\alpha\hbar}(z)\,dz. \]

Since also $e^{-z^2/(4\alpha\hbar)}$ is holomorphic as a function of $z$, we see that $\Psi$ maps
$\mathcal{H}L^2(\mathbb{C}^n,\mu_{2\alpha\hbar})$ isometrically into $\mathcal{H}L^2(\mathbb{C}^n,\nu)$. The map $\Psi$ has an inverse
given by multiplication by $(2\pi\alpha\hbar)^{n/2}e^{z^2/(4\alpha\hbar)}$, showing that $\Psi$ is actually
unitary. In particular, there exist many nonzero holomorphic functions on
$\mathbb{C}^n$ that belong to $\mathcal{H}L^2(\mathbb{C}^n,\nu)$.
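The “simple calculation” behind the isometry of the map in (22.12) can be sketched as follows. The sketch assumes the normalization $\mu_t(z) = (\pi t)^{-n}e^{-|z|^2/t}$ for the Segal-Bargmann density of Definition 14.14; that definition lies outside this section, so the constant should be treated as an assumption. Here $z = x + iy$ with $x, y \in \mathbb{R}^n$ denoting the real and imaginary parts of $z$ (not position and momentum), and $\nu$ is the density of Conclusion 22.10.

```latex
\begin{align*}
\bigl|e^{-z^2/(4\alpha\hbar)}\bigr|^2
  &= e^{-\operatorname{Re}(z^2)/(2\alpha\hbar)}
   = e^{-(|x|^2-|y|^2)/(2\alpha\hbar)}, \\
\bigl|e^{-z^2/(4\alpha\hbar)}\bigr|^2\,\nu(z)
  &= e^{-(|x|^2-|y|^2)/(2\alpha\hbar)}\,e^{-|y|^2/(\alpha\hbar)}
   = e^{-(|x|^2+|y|^2)/(2\alpha\hbar)}
   = e^{-|z|^2/(2\alpha\hbar)}, \\
|\Psi(F)(z)|^2\,\nu(z)
  &= (2\pi\alpha\hbar)^{-n}\,e^{-|z|^2/(2\alpha\hbar)}\,|F(z)|^2
   = |F(z)|^2\,\mu_{2\alpha\hbar}(z).
\end{align*}
```

Integrating the last line over $\mathbb{C}^n$ gives the stated identity of norms. The Gaussian growth of $e^{-z^2/(4\alpha\hbar)}$ in the imaginary directions is exactly what converts the translation-invariant density $\nu$ into the rotationally invariant Gaussian density.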

We will regard any of the Hilbert spaces in Conclusions 22.9 and 22.10
as our quantum Hilbert space. These spaces are to be compared to the
prequantum Hilbert space $L^2(\mathbb{R}^{2n})$, which is in some sense “bigger,” consisting
of functions of twice as many variables. Note that there are multiple
possibilities for the quantum Hilbert space. To reduce from the prequantum Hilbert
space to the quantum Hilbert space, we have to choose a set of $n$ variables,
and then we look at functions that depend only on those $n$ variables.
Indeed, there are many other possibilities for the quantum Hilbert space; we
have considered only the most common choices. We defer a discussion of
the general theory until Chap. 23.

The reader may wonder why we are using the definition $z_j = x_j - i\alpha p_j$
($\alpha > 0$) rather than $z_j = x_j + i\alpha p_j$. If we repeated the preceding calculations
with $z_j = x_j + i\alpha p_j$, with a corresponding sign change in the definition of
$\partial/\partial\bar z_j$, we would find that $\psi$ satisfies $\nabla_{\partial/\partial\bar z_j}\psi = 0$ for all $j$ if and only if $\psi$ is
of the form

\[ \psi(\mathbf{x},\mathbf{p}) = F(z_1,\ldots,z_n)\,e^{\alpha|\mathbf{p}|^2/(2\hbar)}, \tag{22.13} \]

where $F$ is holomorphic on $\mathbb{C}^n$. The change in sign in the exponent between
(22.11) and (22.13) has a drastic effect: There are no nonzero holomorphic
functions $F$ for which the function $\psi$ in (22.13) is square integrable over
$\mathbb{R}^{2n}$. (See Exercise 3.) Unlike the situation with the position and momentum
Hilbert spaces, there is no natural way to alter the domain of integration
to make a function of the form (22.13) have finite norm.

We see, then, that there is a big difference between the definitions
$z_j = x_j - i\alpha p_j$ and $z_j = x_j + i\alpha p_j$. In the general framework of geometric
quantization, we will have a similar distinction, where complex structures
satisfying a certain positivity condition behave well, whereas the “opposite”
complex structures behave badly. (See Definition 23.19 in Sect. 23.4.)


22.5  Quantization  of  Observables 


Now that we have constructed our quantum (as opposed to prequantum)
Hilbert spaces, we need to construct operators on these spaces. According
to the standard geometric quantization program, the quantum operator
associated with a function $f$ is supposed to be simply the restriction to the
quantum Hilbert space of the prequantum operator $Q_{\mathrm{pre}}(f)$, provided that
$Q_{\mathrm{pre}}(f)$ leaves the quantum Hilbert space invariant.

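Proposition 22.11 The position, momentum, and holomorphic subspaces
in Definition 22.7 are all invariant under the prequantum operators $Q_{\mathrm{pre}}(x_j)$
and $Q_{\mathrm{pre}}(p_j)$. Specifically, in the position subspace, we have

\[ Q_{\mathrm{pre}}(x_j)\phi(\mathbf{x}) = x_j\phi(\mathbf{x}), \qquad
   Q_{\mathrm{pre}}(p_j)\phi(\mathbf{x}) = -i\hbar\frac{\partial\phi}{\partial x_j}; \]

in the momentum subspace, we have

\[ Q_{\mathrm{pre}}(x_j)\bigl(e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p})\bigr)
     = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\Bigl(i\hbar\frac{\partial\phi}{\partial p_j}\Bigr), \qquad
   Q_{\mathrm{pre}}(p_j)\bigl(e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\phi(\mathbf{p})\bigr)
     = e^{i\mathbf{x}\cdot\mathbf{p}/\hbar}\bigl(p_j\phi(\mathbf{p})\bigr); \]

and in the holomorphic subspace, we have

\[ Q_{\mathrm{pre}}(x_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr)
     = \Bigl(\alpha\hbar\frac{\partial F}{\partial z_j} + z_jF(z)\Bigr)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}, \]

\[ Q_{\mathrm{pre}}(p_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr)
     = \Bigl(-i\hbar\frac{\partial F}{\partial z_j}\Bigr)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}. \]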

Proof.  See  Exercise  4.  ■ 

The invariance of the three subspaces under the prequantized position
and momentum operators follows from a general result in geometric
quantization, that for a real-valued function $f$, the prequantum operator $Q_{\mathrm{pre}}(f)$
preserves a given quantum space if and only if the Hamiltonian flow
generated by $f$ preserves the polarization defining the quantum space. The
term “polarization” refers to the set of directions in which the elements of
the quantum space are covariantly constant. In the case of the position,
momentum, and holomorphic spaces, the set of such directions is the same
at every point, which means that the polarization is invariant under
translations. But the Hamiltonian flows generated by $x_j$ and $p_j$ are nothing
but translations in the $-p_j$-directions and the $x_j$-directions, respectively.
Of course, in this simple example, we can verify the invariance by direct
computation, which also gives the indicated form of the operators on each
subspace.

Note also that in each case, the “preferred” functions act simply as
multiplication operators. In the position subspace, for example, the position
operator $Q_{\mathrm{pre}}(x_j)$ acts simply as multiplication by $x_j$, whereas in the
momentum subspace, the operator $Q_{\mathrm{pre}}(p_j)$ acts as multiplication by $p_j$.
Finally, in the holomorphic subspace, the operator $Q_{\mathrm{pre}}(z_j)$ acts as
multiplication by $z_j$:

\[ Q_{\mathrm{pre}}(z_j)\bigl(F(z)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}\bigr) = \bigl(z_jF(z)\bigr)e^{-\alpha|\mathbf{p}|^2/(2\hbar)}, \]

where $z_j = x_j - i\alpha p_j$, since the terms involving $\partial F/\partial z_j$ cancel.
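The cancellation can be checked directly from the holomorphic-subspace formulas of Proposition 22.11, using only the linearity of $f \mapsto Q_{\mathrm{pre}}(f)$ and $z_j = x_j - i\alpha p_j$. In the sketch below, $E$ is shorthand (introduced here, not in the text) for the factor $e^{-\alpha|\mathbf{p}|^2/(2\hbar)}$.

```latex
\begin{align*}
Q_{\mathrm{pre}}(z_j)(F E)
 &= Q_{\mathrm{pre}}(x_j)(F E) - i\alpha\,Q_{\mathrm{pre}}(p_j)(F E) \\
 &= \Bigl(\alpha\hbar\,\frac{\partial F}{\partial z_j} + z_j F\Bigr)E
    - i\alpha\Bigl(-i\hbar\,\frac{\partial F}{\partial z_j}\Bigr)E \\
 &= \Bigl(\alpha\hbar\,\frac{\partial F}{\partial z_j} + z_j F
    - \alpha\hbar\,\frac{\partial F}{\partial z_j}\Bigr)E
  = (z_j F)\,E .
\end{align*}
```

The two derivative terms cancel because $(-i\alpha)(-i\hbar) = -\alpha\hbar$, which is the sense in which $z_j$ is a “preferred” function for the holomorphic subspace.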

We now focus on the position Hilbert space and look for operators of the
form $Q_{\mathrm{pre}}(f)$ that leave the position subspace invariant.

Proposition 22.12 The position subspace is invariant under $Q_{\mathrm{pre}}(f)$
whenever $f$ is of the form

\[ f(\mathbf{x},\mathbf{p}) = a(\mathbf{x}) + b_j(\mathbf{x})p_j \tag{22.14} \]

for some smooth functions $a$ and $b_1,\ldots,b_n$ on $\mathbb{R}^n$. On the other hand, the
position subspace is not invariant under the operator $Q_{\mathrm{pre}}(p_j^2)$.

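Proof. If $f$ is of the form (22.14), calculation shows that $\theta(X_f) + f = a(\mathbf{x})$.
If we drop any terms in $X_f$ involving $\partial/\partial p_j$, since these are zero on the
position subspace, we end up with

\[ Q_{\mathrm{pre}}(f)(\phi(\mathbf{x})) = -i\hbar\,b_j(\mathbf{x})\frac{\partial\phi}{\partial x_j} + a(\mathbf{x})\phi(\mathbf{x}), \tag{22.15} \]

which is again in the position subspace. [There is no $\mathbf{p}$-dependence in the
coefficient of $\partial/\partial x_j$ in (22.15) because $\partial f/\partial p_j$ is independent of $\mathbf{p}$.] On
the other hand, direct calculation shows that the restriction to the position
subspace of $Q_{\mathrm{pre}}(p_j^2)$ is

\[ -2i\hbar\,p_j\frac{\partial}{\partial x_j} - p_j^2 \qquad\text{(no sum on } j\text{)}, \]

which does not preserve the space of functions on $\mathbb{R}^{2n}$ that are independent
of $\mathbf{p}$. ■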
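For one degree of freedom, the “direct calculation” for $f = p^2$ can be sketched as follows. The sketch assumes the conventions of Sect. 22.2 are $Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f$, $\nabla_X = X - (i/\hbar)\theta(X)$, $\theta = p\,dx$, and $X_f = \frac{\partial f}{\partial x}\frac{\partial}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial}{\partial x}$. These are not restated in this section; they are chosen because they reproduce the formulas of Proposition 22.11 and the relation $\theta(X_f) + f = a(x)$ used in the proof.

```latex
\begin{align*}
X_{p^2} &= -2p\,\frac{\partial}{\partial x}, \qquad
\theta(X_{p^2}) = -2p^2, \\
Q_{\mathrm{pre}}(p^2)
  &= i\hbar X_{p^2} + \theta(X_{p^2}) + p^2
   = -2i\hbar\,p\,\frac{\partial}{\partial x} - 2p^2 + p^2
   = -2i\hbar\,p\,\frac{\partial}{\partial x} - p^2, \\
Q_{\mathrm{pre}}(p^2)\,\phi(x)
  &= -2i\hbar\,p\,\phi'(x) - p^2\,\phi(x).
\end{align*}
```

The result depends on $p$ for every nonzero $\phi$, so it leaves the position subspace. By contrast, for $f = a(x) + b(x)p$ the same recipe gives $\theta(X_f) + f = -b\,p + a + b\,p = a(x)$, and no $p$-dependence survives.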

It should be noted that the expression on the right-hand side of (22.15)
is not a self-adjoint, or even symmetric, operator on $L^2(\mathbb{R}^n)$, unless the
vector field $b(\mathbf{x})$ happens to be divergence free. (Even though the vector
field $X_f$ is divergence free on $\mathbb{R}^{2n}$, the way $X_f$ acts on functions that are
independent of $\mathbf{p}$ is not necessarily a divergence free vector field on $\mathbb{R}^n$.)
This undesirable feature of our quantization scheme is the result of our
simplistic method of passing from $L^2(\mathbb{R}^{2n})$ to $L^2(\mathbb{R}^n)$ in our derivation of
Conclusion 22.9. When we do this reduction properly, using half-forms, we
will obtain a self-adjoint operator. See Sect. 23.6.
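A one-dimensional sketch shows where the divergence enters. Take $a = 0$ in (22.15), so that $A = -i\hbar\,b(x)\,d/dx$ with $b$ real valued, and let $\phi, \psi \in C_c^\infty(\mathbb{R})$. The inner product is assumed here to be conjugate linear in the first factor, and the boundary terms vanish because of compact support.

```latex
\begin{align*}
\langle\phi, A\psi\rangle
 &= \int_{\mathbb{R}} \overline{\phi(x)}\,\bigl(-i\hbar\,b(x)\,\psi'(x)\bigr)\,dx
  = i\hbar\int_{\mathbb{R}} \bigl(b\,\overline{\phi}\bigr)'(x)\,\psi(x)\,dx \\
 &= \langle A\phi, \psi\rangle
    + i\hbar\int_{\mathbb{R}} b'(x)\,\overline{\phi(x)}\,\psi(x)\,dx .
\end{align*}
```

The extra term vanishes for all $\phi$ and $\psi$ exactly when $b' = 0$, the one-dimensional form of the divergence-free condition. For example, with $b(x) = x$ one gets $\langle\phi, A\psi\rangle - \langle A\phi, \psi\rangle = i\hbar\langle\phi,\psi\rangle$, so $A$ is not symmetric.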

We  now  consider  the  behavior  of  the  holomorphic  subspace  under  the 
prequantized  position  and  momentum  operators. 

Proposition 22.13 For any $\alpha > 0$, let $H_\alpha$ be the subspace of $L^2(\mathbb{R}^{2n})$
consisting of smooth functions $\psi$ that satisfy $\nabla_{\partial/\partial\bar z_j}\psi = 0$, where $\partial/\partial\bar z_j$
is as in (22.9). Then $H_\alpha$ is a closed subspace of $L^2(\mathbb{R}^{2n})$ and $H_\alpha$ is
invariant under the one-parameter unitary groups generated by $Q_{\mathrm{pre}}(x_j)$ and
$Q_{\mathrm{pre}}(p_j)$. Furthermore, $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ act irreducibly on $H_\alpha$ in the
sense of Definition 14.6.

For  each  a  >  0,  the  holomorphic  Hilbert  space  is  a  subspace  of  the 
prequantum  Hilbert  space  invariant  under  the  exponentiated  position  and 
momentum  operators.  Thus,  the  prequantum  Hilbert  space  is  far  from  being 
irreducible  under  the  action  of  those  operators. 

Proof. The invariance of $H_\alpha$ is a simple calculation (Exercise 5).
Irreducibility can be established by reducing to the previously established
irreducibility of the Segal-Bargmann space under the operators $T_a$ in
Theorem 14.16. To this end, we should check that the unitary map $\Psi$ in (22.12)
intertwines products of exponentials of $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ with
operators of the form $T_a$ (with $\hbar$ replaced by $2\alpha\hbar$). This is a straightforward but
tedious calculation, and we omit the details. ■

We conclude this section with an example of a quantum subspace that is
invariant under the (pre)quantized Hamiltonian of a harmonic oscillator.

Proposition 22.14 Consider a harmonic oscillator with Hamiltonian

\[ H = \frac{1}{2m}\bigl(p^2 + (m\omega x)^2\bigr). \]

Consider also the subspace $H_\alpha$ in Proposition 22.13, with $\alpha = 1/(m\omega)$.
Then the operator $Q_{\mathrm{pre}}(H)$ leaves $H_\alpha$ invariant. Furthermore, the
restriction of $Q_{\mathrm{pre}}(H)$ to $H_\alpha$ has non-negative spectrum consisting of eigenvalues
of the form $n\hbar\omega$, where $n$ ranges over the non-negative integers.

Proposition 22.14 gives a much more physically reasonable spectrum for
the quantization of the non-negative function $H$ than the one obtained on the
full prequantum Hilbert space, where (Proposition 22.6) the spectrum of
$Q_{\mathrm{pre}}(H)$ is not even bounded below. When we introduce the “half-form
correction” in Sect. 23.7, we will finally be able to obtain the “correct”
spectrum for the quantum harmonic oscillator, consisting of numbers of
the form $(n + 1/2)\hbar\omega$, $n = 0, 1, 2, \ldots$. See Example 23.53.

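Proof. As in the proof of Proposition 22.6, we introduce the variable
$y = m\omega x$. With $\alpha = 1/(m\omega)$, this gives $z = (y - ip)/(m\omega)$. We use the
symplectic potential

\[ \theta = \frac{1}{2}(p\,dx - x\,dp) = \frac{1}{2m\omega}(p\,dy - y\,dp). \]

Then, since $\partial/\partial\bar z$ carries a factor of $1/2$ as in (22.9),

\[ \theta\Bigl(\frac{\partial}{\partial\bar z}\Bigr) = \frac{1}{4}\Bigl(p + \frac{i}{\alpha}x\Bigr) = \frac{i}{4\alpha}z, \]

and so $\nabla_{\partial/\partial\bar z} = \partial/\partial\bar z + z/(4\alpha\hbar)$. From this, we can easily check that the
holomorphic subspace consists of functions of the form

\[ F(z)\,e^{-|z|^2/(4\alpha\hbar)} = F(z)\exp\Bigl\{-\frac{y^2 + p^2}{4m\omega\hbar}\Bigr\}, \tag{22.16} \]

where $F$ is holomorphic.

Meanwhile, as in the proof of Proposition 22.6, we have

\[ Q_{\mathrm{pre}}(H) = i\hbar\omega\Bigl(y\frac{\partial}{\partial p} - p\frac{\partial}{\partial y}\Bigr), \]

which is just an angular derivative in the $(y,p)$-plane. Since the exponential
factor in (22.16) is rotationally invariant, $Q_{\mathrm{pre}}(H)$ only hits $F$. Meanwhile,

\[ \Bigl(y\frac{\partial}{\partial p} - p\frac{\partial}{\partial y}\Bigr)F
   = y\,\frac{\partial F}{\partial z}\Bigl(\frac{-i}{m\omega}\Bigr) - p\,\frac{\partial F}{\partial z}\,\frac{1}{m\omega}
   = -\frac{i}{m\omega}(y - ip)\frac{\partial F}{\partial z}
   = -iz\frac{\partial F}{\partial z}. \]

Thus,

\[ Q_{\mathrm{pre}}(H)\bigl(F(z)e^{-|z|^2/(4\alpha\hbar)}\bigr) = \hbar\omega\,z\frac{\partial F}{\partial z}\,e^{-|z|^2/(4\alpha\hbar)}, \]

which is again in the holomorphic subspace.

Finally, as in Proposition 14.15, the functions $z^n$, $n = 0, 1, 2, \ldots$, form
an orthogonal basis for the Hilbert space $H_\alpha$. Each monomial $z^n$ is an
eigenvector for the operator $z\,d/dz$ with eigenvalue $n$. This establishes the
claim about the spectrum of the restriction to $H_\alpha$ of $Q_{\mathrm{pre}}(H)$. ■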
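The eigenvalue claim can be made explicit in one line. In the sketch below, $E(z)$ is shorthand (introduced here) for the rotationally invariant exponential factor in (22.16), and the only input is the formula $Q_{\mathrm{pre}}(H)(F E) = \hbar\omega\, z\,(\partial F/\partial z)\,E$ from the proof.

```latex
\begin{align*}
Q_{\mathrm{pre}}(H)\bigl(z^n E\bigr)
  = \hbar\omega\, z\,\frac{d}{dz}\bigl(z^n\bigr)\,E
  = \hbar\omega\, z\cdot n z^{n-1}\,E
  = n\hbar\omega\,\bigl(z^n E\bigr),
  \qquad n = 0, 1, 2, \ldots
\end{align*}
```

So the constant function $F = 1$ has eigenvalue $0$, the function $F = z$ has eigenvalue $\hbar\omega$, and so on. The lowest eigenvalue here is $0$ rather than the physical ground-state energy $\hbar\omega/2$, which is the discrepancy the half-form correction is meant to repair.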

The operator $F \mapsto \hbar\omega\,z\,dF/dz$ is self-adjoint on the holomorphic Hilbert
space, in contrast to the operators in (22.15) in the case of the position
Hilbert space. Indeed, self-adjointness is “automatic” in this case, because
the holomorphic Hilbert space is actually a subspace of the prequantum
Hilbert space, and the restriction of a self-adjoint operator to an invariant
subspace is self-adjoint.


22.6 Exercises


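1. Consider the vector field

\[ X = a_j(\mathbf{x})\frac{\partial}{\partial x_j} \]

on $\mathbb{R}^N$, where the $a_j$'s are smooth, real-valued functions. Show that
$X$ is skew-self-adjoint on $C_c^\infty(\mathbb{R}^N)$ if and only if the divergence of $X$
(i.e., the quantity $\partial a_j/\partial x_j$) is identically zero.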

2. Using the symplectic potential $\theta = p\,dx$, compute $Q_{\mathrm{pre}}(xp^2)$. Show
that $Q_{\mathrm{pre}}(xp^2)$ is not in the algebra of operators generated by $Q_{\mathrm{pre}}(x)$
and $Q_{\mathrm{pre}}(p)$.

Hint: Consider how $Q_{\mathrm{pre}}(xp^2)$ acts on functions that are independent
of $p$.

3. (a) Suppose $F$ is a holomorphic function on $\mathbb{C}$ such that

\[ \int_{\mathbb{C}} |F(z)|^2\,dz < \infty, \]

where here $dz$ denotes the 2-dimensional Lebesgue measure on
$\mathbb{C} \cong \mathbb{R}^2$. Show that $F$ is identically zero.

Hint: If $F$ is not identically zero, use a power series argument
to show that the $L^2$ norm of $F$ over a disk of radius $R$ tends to
infinity as $R$ tends to infinity.

(b) Show that if a function of the form (22.13), with $F$ holomorphic
on $\mathbb{C}^n$, is square integrable, then $F$ must be identically zero.


4. Prove Proposition 22.11, using the explicit form of $Q_{\mathrm{pre}}(x_j)$ and
$Q_{\mathrm{pre}}(p_j)$ in Example 22.4.

Hint: In the case of the holomorphic subspace, express the operators
$\partial/\partial x_j$ and $\partial/\partial p_j$ in terms of the operators $\partial/\partial z_j$ and $\partial/\partial\bar z_j$ in (22.9).

5. Show that the space of functions of the form in (22.11), where $F$ is
holomorphic on $\mathbb{C}^n$, is invariant under the operators $e^{itQ_{\mathrm{pre}}(x_j)}$ and
$e^{itQ_{\mathrm{pre}}(p_j)}$ computed in (22.6), for all $t \in \mathbb{R}$ and $j = 1, 2, \ldots, n$.


Geometric Quantization on Manifolds


23.1  Introduction 

Geometric quantization is a type of quantization, which is a general term
for a procedure that associates a quantum system with a given classical
system. In practical terms, if one is trying to deduce what sort of quantum
system should model a given physical phenomenon, one often begins by
observing the classical limit of the system. Electromagnetic radiation, for
example, is describable on a macroscopic scale by Maxwell’s equations. On
a finer scale, quantum effects (photons) become important. How should one
determine the correct quantum theory of electromagnetism? It seems that
the only reasonable way to proceed is to “quantize” Maxwell’s equations,
and then to compare the resulting quantum system to experiment.

Meanwhile, not every physically interesting system has $\mathbb{R}^{2n}$ as its phase
space. Geometric quantization, then, is an attempt to construct a quantum
Hilbert space, together with appropriate operators, starting from a
physical system having an arbitrary $2n$-dimensional symplectic manifold $N$ as
its phase space. To perform geometric quantization on $N$, one must first
choose a polarization, that is, roughly, a choice of $n$ directions on $N$ in which
the wave functions will be constant. If $N = T^*M$, then one may use the
“vertical polarization,” in which the wave functions are constant along the
fibers of $T^*M$. For cotangent bundles with the vertical polarization,
geometric quantization reproduces the “half-density quantization” of Blattner
[4]. (See Examples 23.45 and 23.48.) Even for cotangent bundles, however,
it is of interest to use polarizations other than the vertical polarization, as
we have seen already in the $\mathbb{R}^n$ case. In the case of the cotangent bundle of
a compact Lie group, for example, the paper [20] shows how quantization
with a complex polarization gives rise to a generalized Segal-Bargmann
transform.

Some phase spaces, meanwhile, may not even be in the form of a
cotangent bundle. In the orbit method in representation theory, for example,
the relevant symplectic manifolds are “coadjoint orbits,” which typically
are not cotangent bundles. [In the $\mathrm{SU}(2)$ case, for instance, these orbits are
2-spheres with the natural rotationally invariant symplectic form.] In
quantum field theory, meanwhile, one encounters Lagrangians that are linear,
rather than quadratic, in the “velocity” variables. In such cases, the initial
velocity is determined by the initial position, and one cannot think of the
space of initial conditions as a (co)tangent bundle. Systems of this form can
still be symplectic, but they are not cotangent bundles. Furthermore, it is
common to think of compact symplectic manifolds (such as $S^2$ with a
rotationally invariant symplectic form) as classical models of internal degrees
of freedom, such as spin.

To quantize these more general symplectic manifolds, one needs a more
general approach to quantization. Given a symplectic manifold $(N,\omega)$
satisfying a certain integrality condition, one can construct a line bundle $L$
over $N$ along with a connection $\nabla$ on $L$ which has a curvature of $\omega/\hbar$.
One can then define “prequantum” operators, acting on sections of $L$, in
much the same way we did in the Euclidean case in Chap. 22, and these
operators will have the desired relationship between Poisson brackets and
commutators. One then chooses a polarization on $N$ and defines the
quantum Hilbert space to be the space of sections that are covariantly constant
in the directions of that polarization. If the Hamiltonian flow generated by
a function $f$ preserves the relevant polarization, then $Q_{\mathrm{pre}}(f)$ will preserve
the quantum Hilbert space. In the case of real polarizations, there may fail
to be any nonzero square-integrable sections that are covariantly constant
in the directions of the polarization, a possibility that forces us to introduce
the machinery of “half-forms.”

Let  us  end  this  introduction  with  a  brief  critique  of  the  framework  of  geo¬ 
metric  quantization.  In  the  first  place,  geometric  quantization  has  too  many 
definitions  (bundles,  connections,  curvature,  polarizations,  half-forms)  and 
too  few  theorems.  In  the  second  place,  the  class  of  functions  that  geometric 
quantization  allows  us  to  quantize — those  functions  for  which  the  associ¬ 
ated  Hamiltonian  flow  preserves  the  polarization — is  often  dishearteningly 
small. In the case $N = T^*M$, for example, with the natural “vertical”
polarization,  geometric  quantization  does  not  allow  us  to  quantize  the  ki¬ 
netic  energy  function,  at  least  not  by  the  “standard  procedure”  of  geomet¬ 
ric  quantization.  Nevertheless,  geometric  quantization  is  the  only  game  in 
town  if  one  wants  to  quantize  general  symplectic  manifolds  in  a  way  that 
produces an actual Hilbert space and operators thereon.

This chapter lays out in an orderly fashion all the ingredients needed
to  “do”  geometric  quantization.  Furthermore,  although  this  approach  in¬ 
creases  length,  the  chapter  fills  in  the  details  of  several  arguments  that 
are  only  sketched  in  the  standard  reference  on  the  subject,  the  book  [45]  of 
Woodhouse.  The  presentation  assumes  basic  results  about  symplectic  man¬ 
ifolds  from  Chap.  21.  Besides  the  basic  results  about  manifolds  reviewed  in 
Sect.  21.1,  we  will  make  use  of  the  Frobenius  theorem  (see,  e.g.,  Chap.  19 
of  [29]). 

As  we  have  noted  already  in  the  introduction  to  Chap.  22,  sign  con¬ 
ventions  in  the  subject  of  geometric  quantization  are  not  consistent  from 
author  to  author. 


23.2  Line  Bundles  and  Connections 

In this section, we develop the necessary machinery to extend the
prequantization construction of Sect. 22.2 to arbitrary symplectic manifolds. We
introduce the notion of a line bundle over a manifold and sections thereof,
which look locally like complex-valued functions. We then introduce the
notion of covariant derivatives of sections of a line bundle, where locally
these covariant derivatives take the form $\nabla_X = X - i\theta(X)$ for a certain
1-form $\theta$. We then introduce the curvature 2-form, which is a globally
defined, closed 2-form that can be computed locally as $d\theta$. We continue to
observe the summation convention, in which repeated indices are always
summed on.

Definition 23.1 If $X$ is a smooth manifold, a complex line bundle over
$X$ is a smooth manifold $L$ together with the following additional structures.
First, we have a smooth, surjective map $\pi : L \to X$. Second, for each $x \in X$,
the set $\pi^{-1}(\{x\})$ is equipped with the structure of a complex vector space of
dimension 1. For each $x \in X$, the vector space $\pi^{-1}(\{x\})$ is called the fiber
of $L$ over $x$.

These structures are assumed to satisfy the local triviality property,
namely that each $x \in X$ has a neighborhood $U$ such that there exists a
diffeomorphism $\chi : \pi^{-1}(U) \to U \times \mathbb{C}$ with the following properties. First,

\[ \pi(p) = \pi_1(\chi(p)), \]

where $\pi_1 : U \times \mathbb{C} \to U$ is projection onto the first factor. Second, for each
$x \in U$, the map $p \mapsto \pi_2(\chi(p))$, where $\pi_2 : U \times \mathbb{C} \to \mathbb{C}$ is projection onto
the second factor, is a vector space isomorphism of $\pi^{-1}(\{x\})$ with $\mathbb{C}$.

A section of a line bundle $L$ over $X$ is a map $s : X \to L$ such that
$\pi(s(p)) = p$ for all $p \in X$.

For any manifold $X$, we can form the trivial line bundle $X \times \mathbb{C}$, where
$\pi(x, z) = x$ and where the vector space structure on $\{x\} \times \mathbb{C}$ is just the
usual vector space structure on $\mathbb{C}$. The local triviality property for a general
line bundle $L$ means that $L$ “looks” locally like the trivial line bundle.

Definition 23.2 A connection $\nabla$ on a line bundle $L$ over $N$ is a map
associating to each vector field $X$ on $N$ and section $s$ of $L$ another
section $\nabla_X(s)$ of $L$ satisfying the following properties. First, for each smooth
function $f$ on $N$, we have

\[ \nabla_{fX}(s) = f\,\nabla_X(s) \tag{23.1} \]

for all vector fields $X$ and sections $s$. Second, for each smooth function $f$
on $N$, we have the product rule

\[ \nabla_X(fs) = (X(f))s + f\,\nabla_X(s) \tag{23.2} \]

for all vector fields $X$ and sections $s$.

Note that for any section $s$ of $L$ and any function $f$ on $N$, the quantity
$fs$ is a section of $L$. Given a connection $\nabla$ and a vector field $X$, the operator
$\nabla_X$ is called the covariant derivative in the direction of $X$.

Definition 23.3 A Hermitian structure on a line bundle $L$ over $N$ is
a choice of an inner product $(\cdot,\cdot)$ on each fiber $\pi^{-1}(\{x\})$ of $L$ such that
for each smooth section $s$ of $L$, $(s, s)$ is a smooth function on $N$. A line
bundle $L$ together with a choice of a Hermitian structure on $L$ will be called
a Hermitian line bundle. A connection $\nabla$ on a Hermitian line bundle
$L$ is called Hermitian if for every vector field $X$ on $N$, we have

\[ (\nabla_X(s_1), s_2) + (s_1, \nabla_X(s_2)) = X(s_1, s_2) \tag{23.3} \]

for all smooth sections $s_1$ and $s_2$ of $L$.

We  will  let  the  expression  “Hermitian  line  bundle  with  connection”  refer 
to  a  Hermitian  line  bundle  L  together  with  a  Hermitian  connection  on  L; 
that  is,  in  this  expression,  “Hermitian”  applies  both  to  the  bundle  and  to 
the  connection. 

Given a Hermitian line bundle $L$ with connection, it is always possible
to choose a locally defined smooth section $s_0$ near any point such that
$(s_0, s_0) = 1$. We call $s_0$ a local isometric trivialization of $L$. Any section
$s$ of $L$ can be written locally as $s = fs_0$ for a unique complex-valued
function $f$. Given a vector field $X$, let $\theta(X)$ be the unique function such
that

\[ \nabla_X(s_0) = -i\theta(X)s_0. \]

Using the assumption $\nabla_{fX} = f\nabla_X$, it can be shown (Exercise 1) that the
value of $\theta(X)$ at a point $p$ depends only on the value of $X$ at $p$. Thus, $\theta$
defines a 1-form on $N$. Using the assumption that $\nabla$ is Hermitian, it can
be shown (Exercise 2) that $\theta(X)$ is always real valued.

Now, using the product rule (23.2) for covariant derivatives, we have

\[ \nabla_X(fs_0) = X(f)s_0 + f\,\nabla_X(s_0) = \bigl(X(f) - i\theta(X)f\bigr)s_0. \]

Thus, if we identify sections of $L$ locally with the coefficient function $f$, we
have

\[ \nabla_X(f) = X(f) - i\theta(X)f, \tag{23.4} \]

as in Sect. 22.2. We call $\theta$ the connection 1-form associated to the particular
local isometric trivialization.

Definition 23.4 For any Hermitian line bundle $(L,\nabla)$ with connection,
define the curvature 2-form $\omega$ of $\nabla$ by requiring that

\[ \omega(X, Y)s = i\bigl(\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]}\bigr)(s) \]

for all sections $s$ and vector fields $X$ and $Y$.

Of course, one should check that the given expression for $\omega$ is really a
2-form, meaning that the value of $\omega(X,Y)$ at a point $z$ depends only on
the values of $X$ and $Y$ at $z$, and that it does not depend on the choice of
section $s$, provided only that $s(z) \neq 0$. One way to do this is to compute $\omega$
in a local isometric trivialization, as in the following result. (See Exercise 3
for a different approach.)

Proposition 23.5 Let $s_0$ be a local isometric trivialization of $L$ and let $\theta$
be the associated connection 1-form. Then the curvature 2-form $\omega$ of $\nabla$ is
expressed locally as

\[ \omega = d\theta. \]

In particular, $\omega$ is a closed 2-form.

Proof.  The  computation  is  precisely  the  same  as  in  the  proof  of  Proposition 
22.3  in  the  Euclidean  case.  ■ 
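For reference, the local computation can be sketched as follows. It uses only the local form $\nabla_X f = X(f) - i\theta(X)f$ from (23.4) and the standard identity $d\theta(X,Y) = X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])$; sections are identified with their coefficient functions $f$ in a fixed local isometric trivialization.

```latex
\begin{align*}
\nabla_X\nabla_Y f
 &= XYf - iX(\theta(Y))\,f - i\theta(Y)\,Xf - i\theta(X)\,Yf - \theta(X)\theta(Y)\,f, \\
(\nabla_X\nabla_Y - \nabla_Y\nabla_X)f
 &= [X,Y]f - i\bigl(X(\theta(Y)) - Y(\theta(X))\bigr)f, \\
\nabla_{[X,Y]}f
 &= [X,Y]f - i\theta([X,Y])\,f, \\
i\bigl(\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]}\bigr)f
 &= \bigl(X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])\bigr)f
  = d\theta(X,Y)\,f .
\end{align*}
```

All derivatives of $f$ cancel, which is why the right-hand side is multiplication by a function and why that function depends only on the pointwise values of $X$ and $Y$.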

A locally defined 1-form $\theta$ satisfying $d\theta = \omega$ is called a (local) symplectic
potential for $\omega$. Our next result says that every symplectic potential is the
connection 1-form for some local isometric trivialization of $L$.

Proposition 23.6 Let $(L,\nabla)$ be a Hermitian line bundle with connection
over $N$ with curvature 2-form $\omega$. For each point $z_0 \in N$ and 1-form $\theta$
defined in a neighborhood $U$ of $z_0$ satisfying $d\theta = \omega$, there is a
subneighborhood $V \subset U$ of $z_0$ and a local isometric trivialization of $L$ over $V$ such
that the connection 1-form of the trivialization is $\theta$.

Proof. Let $s_0$ be any isometric trivializing section defined in a
neighborhood of $z_0$ and let $\eta$ be the associated connection 1-form. Since $d(\eta - \theta) = 0$,
there is a subneighborhood $V \subset U$ of $z_0$ on which $\eta - \theta = df$, for some
smooth real-valued function $f$. If $s_1 = e^{if}s_0$, then

\[ \nabla_X(s_1) = iX(f)e^{if}s_0 + e^{if}\nabla_X(s_0)
   = iX(f)e^{if}s_0 - i\eta(X)e^{if}s_0
   = -i\bigl(\eta(X) - df(X)\bigr)s_1. \]

Thus, the connection 1-form associated with the local isometric
trivialization $s_1$ is $\eta - df = \theta$. ■

Proposition 23.7 If $(L_1,\nabla^1)$ and $(L_2,\nabla^2)$ are Hermitian line bundles
with connection over $N$, let $L_1 \otimes L_2$ denote the line bundle over $N$ for
which the fiber over $x$ is $L_{1,x} \otimes L_{2,x}$, with the natural inner product induced
by the inner products on $L_{1,x}$ and $L_{2,x}$. Then there is a unique Hermitian
connection $\nabla$ on $L_1 \otimes L_2$ with the property that

\[ \nabla_X(s_1 \otimes s_2) = (\nabla^1_X s_1) \otimes s_2 + s_1 \otimes (\nabla^2_X s_2), \]

for all vector fields $X$ on $N$ and all smooth sections $s_1$ of $L_1$ and $s_2$ of $L_2$.
The curvature 2-form $\omega$ for $(L_1 \otimes L_2, \nabla)$ is given by

\[ \omega = \omega_1 + \omega_2, \]

where $\omega_1$ and $\omega_2$ are the curvature 2-forms for $(L_1,\nabla^1)$ and $(L_2,\nabla^2)$,
respectively.


The  proof  of  this  proposition  is  a  straightforward  exercise  in  “definition 
chasing”  and  is  left  as  an  exercise  to  the  reader. 

Suppose that $L$ is a Hermitian line bundle over $N$ with connection $\nabla$
and curvature 2-form $\omega$. Given a loop $\gamma : [a, b] \to N$, we can construct a
section $s$ of $L$ that is defined over $\gamma$ such that the covariant derivative of $s$
in the directions along $\gamma$ is zero. Indeed, in a local isometric trivialization,
such a section can be constructed (identifying $s$ with its coefficient function,
as in (23.4)) as

\[ s(\gamma(t)) = \exp\Bigl\{ i\int_a^t \theta(\dot\gamma(\tau))\,d\tau \Bigr\}. \tag{23.5} \]

The value of $s$ at the endpoint of the loop will in general not agree with the
value at the starting point, but will differ by multiplication by a constant
of absolute value 1.


Definition 23.8 The holonomy of a loop $\gamma : [a, b] \to N$ is the unique
constant $\alpha$ (of absolute value 1) such that $s(\gamma(b)) = \alpha s(\gamma(a))$, where $s$ is a
nonzero section defined over $\gamma$ that is covariantly constant in the directions
along $\gamma$.

The value of the holonomy of $\gamma$ is easily seen to be independent of the
value of $s$ at the starting point, provided this starting value is nonzero.

Suppose that $S$ is a compact, oriented surface with boundary in $N$ whose
boundary $\partial S$ is a loop. It is not hard to show that the holonomy around
$\partial S$ can be computed as

$$\text{holonomy}(\partial S) = \exp\left\{ i \int_S \omega \right\}. \qquad (23.6)$$

Indeed, if $S$ is contained in the domain of a local isometric trivialization,
then this result follows from (23.5) by means of Stokes's theorem
(Sect. 21.1.2).
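
As a numerical sanity check of (23.6), the following sketch approximates the integral in (23.5) around a circle in the plane and compares the result with $\exp\{i \int_S \omega\}$ for the enclosed disk. The setting is an illustrative assumption and is not taken from the text: $N = \mathbb{R}^2$ with the trivial bundle, connection 1-form $\theta = x\,dy$ (so that $\omega = d\theta = dx \wedge dy$), and the hypothetical function names `holonomy_circle` and `holonomy_from_curvature`.

```python
import cmath
import math


def holonomy_circle(radius, steps=2000):
    """Approximate exp(i * integral of theta along the circle), theta = x dy.

    The loop is gamma(t) = (r cos t, r sin t), 0 <= t <= 2*pi, so that
    theta(gamma'(t)) = x(t) * y'(t) = r^2 cos^2 t.  (theta = x dy is an
    illustrative choice with d(theta) = dx ^ dy.)
    """
    dt = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt            # midpoint rule
        x = radius * math.cos(t)
        dy_dt = radius * math.cos(t)
        total += x * dy_dt * dt
    return cmath.exp(1j * total)


def holonomy_from_curvature(radius):
    """Right-hand side of (23.6): exp(i * area of the enclosed disk)."""
    return cmath.exp(1j * math.pi * radius ** 2)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, holonomy_circle(r), holonomy_from_curvature(r))
```

The two columns printed for each radius should agree up to rounding, which is the content of Stokes's theorem in this special case.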

Now, if $S$ is a closed (i.e., boundaryless) surface, its boundary is the
trivial loop, which has a holonomy that is trivial, that is, equal to 1. (Think
of approximating $S$ by a surface for which the boundary is a very small
loop.) Thus, for any closed surface $S$, (23.6) gives

$$\exp\left\{ i \int_S \omega \right\} = 1, \qquad \partial S = \emptyset. \qquad (23.7)$$

Equivalently, we have

$$\frac{1}{2\pi} \int_S \omega \in \mathbb{Z}. \qquad (23.8)$$

The condition (23.8) says that $\omega/(2\pi)$ is an integral 2-form. Clearly, not
every closed 2-form satisfies this property.

The closedness of $\omega$ (Proposition 23.5) and the condition (23.8) represent
necessary conditions that the curvature of a Hermitian connection must
satisfy. It turns out that these two conditions are also sufficient.

Theorem 23.9 Suppose $\omega$ is a closed 2-form on a manifold $N$ for which
$\omega/(2\pi)$ is integral in the sense of (23.8). Then there exists a Hermitian
line bundle $L$ over $N$ with Hermitian connection $\nabla$ such that the curvature
of $\nabla$ is equal to $\omega$. If, in addition, $N$ is simply connected, then $(L, \nabla)$ is
unique up to equivalence.

See Sect. 8.3 of [45] for a proof of this result. An equivalence of two
Hermitian line bundles $L_1$ and $L_2$ with Hermitian connection over $N$ is a
diffeomorphism $\Phi : L_1 \to L_2$ such that for each $x \in N$, the restriction of
$\Phi$ to $\pi_1^{-1}(\{x\})$ is an isometric linear map onto $\pi_2^{-1}(\{x\})$ and such that for
each section $s$ of $L_1$, we have

$$\Phi(\nabla^1_X(s)) = \nabla^2_X(\Phi(s)).$$


We now have the necessary tools to proceed with the program of geometric
quantization on symplectic manifolds.

23.3 Prequantization


The first step in the program of geometric quantization for a symplectic
manifold $(N, \omega)$ is to construct a Hermitian line bundle $L$ over $N$ with
Hermitian connection for which the curvature 2-form is equal to $\omega/\hbar$.
Theorem 23.9 gives the condition for the existence of such a bundle.

Definition 23.10 A symplectic manifold $(N, \omega)$ is quantizable (for a
particular value of $\hbar$) if

$$\frac{1}{2\pi\hbar} \int_S \omega \in \mathbb{Z}$$

for every closed surface $S$ in $N$.

Note that if $(N, \omega)$ is quantizable for a given value $\hbar_0$ of Planck's
constant, then $(N, \omega)$ is also quantizable for $\hbar = \hbar_0/k$ for every positive integer
$k$. Indeed, according to Proposition 23.7, if $L$ is a Hermitian line bundle
with connection having curvature $\omega/\hbar_0$, then $L^{\otimes k}$ (the tensor product of
$L$ with itself $k$ times) is a Hermitian line bundle with connection having
curvature $k\omega/\hbar_0 = \omega/(\hbar_0/k)$.
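
To make the scaling $\hbar = \hbar_0/k$ concrete, here is a small sketch for the case where $N$ is a 2-sphere with total symplectic area $A$. In that case the integral of $\omega$ over any closed surface in $N$ is an integer multiple of $A$, so the quantizability condition reduces to $A/(2\pi\hbar) \in \mathbb{Z}$. The choice of the sphere, the function name `sphere_is_quantizable`, and the floating-point tolerance are assumptions made for illustration.

```python
import math


def sphere_is_quantizable(total_area, hbar, tol=1e-9):
    """Check whether total_area / (2*pi*hbar) is an integer, up to tol."""
    n = total_area / (2.0 * math.pi * hbar)
    return abs(n - round(n)) < tol


if __name__ == "__main__":
    area = 4.0 * math.pi                 # area of the round unit sphere
    hbar0 = 1.0                          # area / (2*pi*hbar0) = 2
    print(sphere_is_quantizable(area, hbar0))
    for k in range(1, 5):
        # dividing hbar0 by k multiplies the integer by k
        print(k, sphere_is_quantizable(area, hbar0 / k))
    print(sphere_is_quantizable(area, 0.3))   # 2 / 0.3 is not an integer
```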

For the remainder of this chapter, we will assume that $N$ is a quantizable
symplectic manifold with symplectic form $\omega$ and that $(L, \nabla)$ is a fixed
Hermitian line bundle with connection over $N$ with curvature $\omega/\hbar$.

If $L$ is a Hermitian line bundle over a symplectic manifold $N$, we say
that a measurable section $s$ of $L$ is square integrable if

$$\|s\| := \left( \int_N (s(x), s(x))\, \lambda(x) \right)^{1/2}$$

is finite, where $\lambda$ is the Liouville volume form on $N$. Given two
square-integrable sections $s_1$ and $s_2$ of $L$, we define the inner product of $s_1$ and
$s_2$ by

$$\langle s_1, s_2 \rangle = \int_N (s_1(x), s_2(x))\, \lambda(x). \qquad (23.9)$$

We use parentheses to denote the pointwise inner product $(s_1(x), s_2(x))$
of two sections $s_1$ and $s_2$, which is a function on $N$, and we use angled
brackets to denote the global inner product $\langle s_1, s_2 \rangle$ of the sections, which
is a number.


Definition 23.11 The prequantum Hilbert space for $N$ is the space of
equivalence classes of square-integrable sections of $L$, where two sections are
equivalent if they are equal almost everywhere with respect to the Liouville
volume measure.

Definition 23.12 If $f$ is a smooth complex-valued function on $N$, the
prequantum operator $Q_{\mathrm{pre}}(f)$ is the unbounded operator on the prequantum
Hilbert space given by

$$Q_{\mathrm{pre}}(f) = i\hbar \nabla_{X_f} + f,$$

where $f$ represents the operation of multiplying a section by $f$.

Proposition 23.13 If $f$ is real-valued, then $Q_{\mathrm{pre}}(f)$ is symmetric on the
space of smooth compactly supported sections of $L$.

Proof. Let $s_1$ and $s_2$ be smooth, compactly supported sections of $L$ and let
$\Phi^f_t$ denote the Hamiltonian flow generated by $f$. For all sufficiently small
$t$, every point in the supports of $s_1$ and $s_2$ will be contained in the domain of
$\Phi^f_t$. Furthermore, by Liouville's theorem, the value of

$$\int_N [(s_1, s_2) \circ \Phi^f_t]\, \lambda$$

is independent of $t$. If we differentiate this relation with respect to $t$ and
evaluate at $t = 0$, we obtain, by (23.3),

$$0 = \int_N [(\nabla_{X_f}(s_1), s_2) + (s_1, \nabla_{X_f}(s_2))]\, \lambda.$$

Thus, $\nabla_{X_f}$ is a skew-symmetric operator on the space of smooth, compactly
supported sections, from which it follows that $Q_{\mathrm{pre}}(f)$ is symmetric. ■

By the product rule for covariant derivatives and the identity $X_f(f) =
\{f, f\} = 0$, we see that the two terms in the definition of $Q_{\mathrm{pre}}(f)$ commute.
We would then expect the exponential $e^{itQ_{\mathrm{pre}}(f)}$ to decompose as a product
of two exponentials. One of these exponentials is just $e^{itf}$ and the other
may be constructed as “parallel transport along the flow generated by $X_f$.”
Thus, if the flow generated by $X_f$ is complete, it is possible to use Stone's
theorem to construct $Q_{\mathrm{pre}}(f)$ as a self-adjoint operator on a domain that
includes the space of smooth compactly supported sections.

Proposition 23.14 For any $f, g \in C^\infty(N)$, we have

$$\frac{1}{i\hbar}[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] = Q_{\mathrm{pre}}(\{f, g\}),$$

where the equality holds as operators on the space of smooth sections of $L$.

Proof. The argument is precisely the same as in Proposition 22.1 in the
$\mathbb{R}^{2n}$ case. ■
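
As a quick check of Proposition 23.14 in the simplest case, take $N = \mathbb{R}^2$ with $\omega = dp \wedge dx$ and the trivial bundle with $\nabla_X = X - (i/\hbar)\theta(X)$, where $\theta = p\,dx$. These choices are assumptions made for this illustration; they follow the sign conventions used in this chapter, in which $\omega(X_f, \cdot) = df$ and hence $X_f = \frac{\partial f}{\partial x}\frac{\partial}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial}{\partial x}$.

```latex
\begin{align*}
X_x &= \frac{\partial}{\partial p}, &
Q_{\mathrm{pre}}(x) &= i\hbar\frac{\partial}{\partial p} + x, \\
X_p &= -\frac{\partial}{\partial x}, &
Q_{\mathrm{pre}}(p) &= -i\hbar\frac{\partial}{\partial x}, \\
\frac{1}{i\hbar}\,[Q_{\mathrm{pre}}(x), Q_{\mathrm{pre}}(p)]
  &= \frac{1}{i\hbar}\left[x, -i\hbar\frac{\partial}{\partial x}\right]
   = 1, &
Q_{\mathrm{pre}}(\{x, p\}) &= Q_{\mathrm{pre}}(1) = 1.
\end{align*}
```

In the formula for $Q_{\mathrm{pre}}(p)$, the term $i\hbar \cdot (-(i/\hbar)\theta(X_p)) = -p$ coming from the connection cancels the multiplication term $+p$, and the commutator of the two derivative terms vanishes, so only $[x, -i\hbar\,\partial/\partial x] = i\hbar$ survives.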

As we have seen already in Sect. 22.3 in the $\mathbb{R}^{2n}$ case, the prequantum
Hilbert space is “too large” to be considered the quantization of $N$.

23.4 Polarizations

In the $\mathbb{R}^{2n}$ case, we have the position, momentum, and holomorphic
subspaces (Definition 22.7), consisting of functions that depend only on $x$, $p$,
or $z$, in the sense that the covariant derivatives of functions in the
directions of $p$, $x$, and $\bar{z}$ are zero. In each case, the “basic observables” of the
particular representation (the $x_j$'s, the $p_j$'s, and the $z_j$'s, respectively) act
simply as multiplication operators.

To generalize this to a symplectic manifold $N$ of dimension $2n$, we may
think of choosing $n$ functions $\alpha_1, \ldots, \alpha_n$ on $N$ that are “independent,” in
the sense that $d\alpha_1, \ldots, d\alpha_n$ are linearly independent at each point. We
assume that the functions $\alpha_j$ Poisson commute ($\{\alpha_j, \alpha_k\} = 0$), which makes
it reasonable to hope that the quantizations of the $\alpha_j$'s could act as
(commuting) multiplication operators. For each $z \in N$, we let $P_z$ be the
$n$-dimensional space of directions in which the $\alpha_j$'s are constant, that is,
the intersection of the kernels of $d\alpha_1, \ldots, d\alpha_n$. Since we wish to allow the
functions $\alpha_j$ to be complex valued, $P_z$ should be thought of as a subspace
of the complexified tangent space $T_z^{\mathbb{C}}(N)$. The idea is that our quantum
Hilbert space should consist of sections of a prequantum line bundle that
are covariantly constant in the directions of $P$.

Now, at each point $z$, the Hamiltonian vector field $X_{\alpha_j}$ will belong to
$P_z$, because

$$d\alpha_j(X_{\alpha_k}) = X_{\alpha_k}(\alpha_j) = \{\alpha_k, \alpha_j\} = 0.$$

Furthermore, since the $d\alpha_j$'s are linearly independent, the $X_{\alpha_j}$'s are also
independent, since $X_{\alpha_j}$ is obtained from $d\alpha_j$ by an isomorphism of tangent
and cotangent spaces. Thus, the $X_{\alpha_j}$'s must actually span $P_z$ at each point,
by a dimension count. Since also $\omega(X_{\alpha_j}, X_{\alpha_k}) = -\{\alpha_j, \alpha_k\} = 0$, we
conclude that $\omega$ is identically zero on $P_z$. Furthermore, if $X$ and $Y$ are vector
fields lying in $P$ at each point, we can express them as

$$X = a_j(z) X_{\alpha_j}, \qquad Y = b_j(z) X_{\alpha_j},$$

for some smooth functions $a_j$ and $b_j$. Then

$$[X, Y] = a_j(z) X_{\alpha_j}(b_k) X_{\alpha_k} - b_k(z) X_{\alpha_k}(a_j) X_{\alpha_j},$$

because $[X_{\alpha_j}, X_{\alpha_k}] = X_{\{\alpha_j, \alpha_k\}} = 0$. Thus, the commutator of two vector
fields lying in $P$ will again lie in $P$.


Definition 23.15 For any $z \in N$, a subspace $P$ of $T_z^{\mathbb{C}} N$ is said to be
Lagrangian if $\dim P = n$ and $\omega(X, Y) = 0$ for all $X, Y \in P$.

Definition 23.16 A polarization of a symplectic manifold $N$ is a choice
at each point $z \in N$ of a Lagrangian subspace $P_z \subset T_z^{\mathbb{C}}(N)$, satisfying the
following two conditions.

1. If two complex vector fields $X$ and $Y$ lie in $P_z$ at each point $z$, then
so does $[X, Y]$.

2. The dimension of $P_z \cap \bar{P}_z$ is constant.

The first condition is called integrability, and we have motivated this
condition in the discussion preceding the definition. The second condition
is a technical one that prevents problems with certain constructions, such
as the pairing map. (Although, in practice, one sometimes needs to work
with “polarizations” in which the second condition is violated, extra care
is needed in such cases.)

There is one small inaccuracy in our discussion of polarizations: For
purely conventional reasons, the quantum Hilbert space is defined as the
space of sections that are covariantly constant in the direction of $\bar{P}$, rather
than $P$. Thus, $P$ should really be the complex conjugate of the space of
directions in which the sections are constant. This convention, however,
makes no difference to the definition of a polarization, since if $P$ satisfies
the conditions of Definition 23.16, so does $\bar{P}$.

Example 23.17 If $M$ is any smooth manifold, let $N = T^*M$ be the
cotangent bundle of $M$, equipped with the canonical 2-form $\omega$ (Example 21.2).
For each $z \in T^*M$, let $P_z$ be the complexification of the tangent space
to the fiber $T_x^*M$ containing $z$. Then $P$ is a polarization on $T^*M$, called the
vertical polarization.

Proof. If $\{x_j\}$ is any local coordinate system on $M$, let $\{x_j, p_j\}$ be the
associated local coordinate system on $T^*M$. The canonical 2-form is given
by $\omega = dp_j \wedge dx_j$. At each point $z \in T^*M$, the vertical subspace $P_z$ is
spanned by the vectors $\partial/\partial p_j$. Since $\omega(\partial/\partial p_j, \partial/\partial p_k) = 0$, we see that $P_z$
is Lagrangian. Furthermore, $\bar{P}_z = P_z$ at every point, and so $\dim P_z \cap \bar{P}_z$
has the constant value $n = \dim M$. Finally, the integrability of $P$ follows by
computing the commutator of two vector fields of the form $f_j(x, p)\, \partial/\partial p_j$,
which will again be a linear combination of the $\partial/\partial p_j$'s. Integrability also
follows from the easy direction of the Frobenius theorem, since the fibers
of $T^*M$ are integral submanifolds for $P$. ■

We may identify two special classes of polarizations, those that are purely
real (i.e., $\bar{P}_z = P_z$ for all $z \in N$) and those that are purely complex (i.e.,
$P_z \cap \bar{P}_z = \{0\}$ for all $z \in N$). The vertical polarization, for example, is
purely real.

If $P$ is purely real, the integrability of $P$ implies, by the Frobenius
theorem, that every point in $N$ is contained in a unique submanifold $R$ that is
maximal in the class of connected integral submanifolds for $P$. [An integral
submanifold $R$ for $P$ is a submanifold for which $T_z(R) = P_z$ for all $z \in R$.]
We will refer to the maximal connected, integral submanifolds of a purely
real polarization as the leaves of the polarization.

In general, the leaves may not be embedded submanifolds of $N$. Suppose,
for example, that $N = S^1 \times S^1$, with $\omega = d\theta \wedge d\phi$, where $\theta$ and $\phi$ are angular
coordinates on the two copies of $S^1$. Then the tangent space to $N$ at any
point may be identified with $\mathbb{R}^2$ by means of the basis $\{\partial/\partial\theta, \partial/\partial\phi\}$. We
may define a polarization $P$ on $N$ by defining $P_z$ to be the span of the
vector

$$\frac{\partial}{\partial\theta} + a \frac{\partial}{\partial\phi}$$

for some fixed irrational number $a$. Each leaf of $P$ is then a set of the form

$$\{ (e^{i(\theta_0 + t)}, e^{iat}) \in S^1 \times S^1 \mid t \in \mathbb{R} \}$$

for some $\theta_0$, which is an “irrational line” in $S^1 \times S^1$. Each leaf is then
dense in $S^1 \times S^1$ and, thus, not embedded. We will need to avoid such
pathological examples if we hope to successfully carry out the program
of geometric quantization with respect to a real polarization. Much more
information about the structure of real polarizations may be found in Sects.
4.5-4.7 of [45].
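
The density of an irrational leaf can be illustrated numerically. Each time the curve $t \mapsto (e^{i(\theta_0 + t)}, e^{iat})$ returns to $\theta = \theta_0$ (that is, at $t = 2\pi k$), its second coordinate has angle $2\pi a k$ mod $2\pi$. The sketch below, whose function name `largest_gap` and sample values of $a$ are illustrative assumptions, measures the largest gap left on the circle by the first $n$ such return points, in units of a full turn.

```python
import math


def largest_gap(a, n):
    """Largest gap (as a fraction of a full turn) between the points
    k*a mod 1, k = 0, ..., n-1, on the circle R/Z."""
    points = sorted((k * a) % 1.0 for k in range(n))
    gaps = [q - p for p, q in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])   # wrap-around gap
    return max(gaps)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # irrational slope: the gaps shrink as more return points are added
        print(n, largest_gap(math.sqrt(2.0), n))
    # rational slope: the leaf closes up and the gap stays fixed
    print(largest_gap(0.5, 1000))
```

For an irrational slope the largest gap tends to zero as $n$ grows, so the leaf comes arbitrarily close to every point of the torus; for a rational slope the curve closes up and the gap does not shrink.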

We now consider some elementary results concerning purely complex
polarizations.

Proposition 23.18 Suppose $P$ is a purely complex polarization on $N$. For
each $z \in N$, let $J_z : T_z^{\mathbb{C}} N \to T_z^{\mathbb{C}} N$ be the unique linear map such that $J_z =
iI$ on $P_z$ and $J_z = -iI$ on $\bar{P}_z$. Then $J_z$ is real (i.e., it maps the real tangent
space to itself) and $\omega$ is $J_z$-invariant [i.e., $\omega(J_z X_1, J_z X_2) = \omega(X_1, X_2)$ for
all $X_1, X_2 \in T_z^{\mathbb{C}} N$].

Proof. Since the restriction of $J_z$ to $\bar{P}_z$ is the complex-conjugate of its
restriction to $P_z$, the map $J_z$ commutes with complex conjugation and thus
maps real vectors (those satisfying $\bar{X} = X$) to real vectors. Meanwhile,
since $P_z$ is Lagrangian and $\omega$ is real, $\bar{P}_z$ is also Lagrangian. Given two
vectors $X_1 = Y_1 + Z_1$ and $X_2 = Y_2 + Z_2$, with $Y_j \in P_z$ and $Z_j \in \bar{P}_z$, we
compute that

$$\begin{aligned}
\omega(J_z X_1, J_z X_2)
&= \omega(iY_1, iY_2) + \omega(iY_1, -iZ_2) + \omega(-iZ_1, iY_2) + \omega(-iZ_1, -iZ_2) \\
&= \omega(Y_1, Z_2) + \omega(Z_1, Y_2).
\end{aligned}$$

A similar calculation gives the same value for $\omega(X_1, X_2)$, showing that $\omega$
is $J_z$-invariant. ■

A complex structure on a $2n$-dimensional manifold $N$ is a collection of
“holomorphic” coordinate systems that cover $N$ and such that the
transition maps between coordinate systems are holomorphic as maps between
open sets in $\mathbb{R}^{2n} = \mathbb{C}^n$. At each point $z \in N$, there is a linear map
$J_z : T_z N \to T_z N$ defined by the expression

$$J_z\left(\frac{\partial}{\partial x_j}\right) = \frac{\partial}{\partial y_j}, \qquad J_z\left(\frac{\partial}{\partial y_j}\right) = -\frac{\partial}{\partial x_j},$$

where the $x_j$'s and $y_j$'s are the real and imaginary parts of holomorphic
coordinates. This map is independent of the choice of holomorphic
coordinates and satisfies $J_z^2 = -I$. At each point $z \in N$, the complexified tangent
space $T_z^{\mathbb{C}} N$ can be decomposed into eigenspaces for $J_z$ with eigenvalues $i$
and $-i$; these are called the (1,0)- and (0,1)-tangent spaces, respectively.

Meanwhile, if $N$ is any $2n$-dimensional manifold and $J$ is a smoothly
varying family of linear maps on each tangent space satisfying $J_z^2 = -I$ for
all $z$, then $J$ is called an almost-complex structure. Given an almost-complex
structure, we can divide the complexified tangent space into $\pm i$ eigenspaces
for $J$. The Newlander-Nirenberg theorem asserts that if the family of $+i$
eigenspaces is integrable (in the sense of Point 1 of Definition 23.16), then
there exists a unique complex structure on $N$ for which these are the (1,0)-tangent
spaces.

A purely complex polarization $P$ gives rise to a complex structure on $N$,
as follows. By Proposition 23.18 and the Newlander-Nirenberg theorem,
there is a unique complex structure on $N$ for which $P_z$ is the (1,0)-tangent
space, for all $z \in N$.

Now, we have already seen in the $\mathbb{R}^{2n}$ case that some purely complex
polarizations behave better than others. [Compare (22.11) to (22.13).] The
geometric condition that characterizes the “good” polarizations is the
following.

Definition 23.19 For any purely complex polarization $P$, let $J$ be the
unique almost-complex structure on $N$ such that $J_z = iI$ on $P_z$ and $J_z =
-iI$ on $\bar{P}_z$. We say that $P$ is a Kähler polarization if the bilinear form

$$g(X, Y) := \omega(X, J_z Y) \qquad (23.10)$$

is positive definite for each $z \in N$.

For any purely complex polarization, the bilinear form $g$ in (23.10) is
symmetric, as the reader may easily verify using the $J_z$-invariance of $\omega$.

Suppose, for example, that we identify $\mathbb{R}^2$ with $\mathbb{C}$ by the map $z = x - i\alpha p$,
for some fixed $\alpha > 0$. If we define a purely complex polarization on $\mathbb{R}^2$ by
taking $P_z$ to be the span of the vector $\partial/\partial z$ in (22.9), then (Exercise 4), $P$
is a Kähler polarization.
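
The claim about $z = x - i\alpha p$ can be checked by direct linear algebra. The sketch below assumes $\omega = dp \wedge dx$ on $\mathbb{R}^2$ and takes $P_z$ to be spanned by $\partial/\partial z = \tfrac{1}{2}\,\partial/\partial x + \tfrac{i}{2\alpha}\,\partial/\partial p$, which is what the chain rule gives for this choice of $z$; equation (22.9) is not reproduced here, so this form of the spanning vector is an assumption. The function names `apply_J`, `omega`, and `g` are hypothetical.

```python
def apply_J(alpha, X):
    """Apply J to a real vector X = (X_x, X_p), where J = i on span{v} and
    J = -i on span{conj(v)}, with v = (1/2, i/(2*alpha)) (assumed d/dz)."""
    v = (0.5 + 0j, 1j / (2.0 * alpha))
    # Write X = c*v + conj(c)*conj(v) = 2*Re(c*v) with c = a + i*b:
    #   2*(a*Re(v_k) - b*Im(v_k)) = X_k  for k = x, p.
    m11, m12 = 2 * v[0].real, -2 * v[0].imag
    m21, m22 = 2 * v[1].real, -2 * v[1].imag
    det = m11 * m22 - m12 * m21
    a = (X[0] * m22 - m12 * X[1]) / det
    b = (m11 * X[1] - m21 * X[0]) / det
    c = complex(a, b)
    # J X = i*c*v + conj(i*c*v) = 2*Re(i*c*v)
    return tuple(2 * (1j * c * vk).real for vk in v)


def omega(X, Y):
    """omega = dp ^ dx evaluated on X = (X_x, X_p), Y = (Y_x, Y_p)."""
    return X[1] * Y[0] - X[0] * Y[1]


def g(alpha, X, Y):
    """The bilinear form (23.10): g(X, Y) = omega(X, J Y)."""
    return omega(X, apply_J(alpha, Y))


if __name__ == "__main__":
    alpha = 2.0
    ex, ep = (1.0, 0.0), (0.0, 1.0)
    print([[g(alpha, u, w) for w in (ex, ep)] for u in (ex, ep)])
```

Under these assumptions the matrix of $g$ in the basis $\{\partial/\partial x, \partial/\partial p\}$ comes out, up to rounding, as the diagonal matrix with entries $1/\alpha$ and $\alpha$, which is positive definite for every $\alpha > 0$.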


23.5  Quantization  Without  Half-Forms 

To construct a prequantum Hilbert space, we must choose a line bundle
$(L, \nabla)$ over $(N, \omega)$ having curvature $\omega/\hbar$. Such a bundle exists if $\omega/(2\pi\hbar)$ is
an integral 2-form and is unique (up to equivalence) if $N$ is simply
connected. To pass to the quantum Hilbert space, we must make a substantial
additional choice, that of a polarization $P$ on $N$. In our first attempt at
defining the quantum Hilbert space associated with $P$, we consider the
space of sections of $(L, \nabla)$ that are covariantly constant in the directions
of $\bar{P}$. Although this approach works reasonably well for a purely complex
polarization, in the case of a purely real polarization, there typically are no
square-integrable sections satisfying this condition. (Indeed, we have seen
this problem already in the $\mathbb{R}^{2n}$ case, in Sect. 22.4.) In the next section, we
will introduce half-forms to address this problem.

In the remainder of the chapter, we will let $P$ denote a fixed polarization
on $N$.


23.5.1  The  General  Case 

As we have remarked, it is customary to consider sections that are
covariantly constant in the directions of $\bar{P}$ rather than in the directions
of $P$.

Definition 23.20 A smooth section $s$ of $L$ is polarized (with respect to
$P$) if

$$\nabla_X s = 0 \qquad (23.11)$$

for every vector field $X$ lying in $\bar{P}$. The quantum Hilbert space associated
with $P$ is the closure in the prequantum Hilbert space of the space of smooth,
square-integrable, polarized sections of $L$.

As in the Euclidean case, we will simply restrict the prequantum
operators to the quantum Hilbert space, in those cases where $Q_{\mathrm{pre}}(f)$ preserves
the space of polarized sections.

Definition 23.21 A smooth, complex-valued function $f$ on $N$ is
quantizable with respect to $P$ if $Q_{\mathrm{pre}}(f)$ preserves the space of smooth sections
that are polarized with respect to $P$.

The following definition will provide a natural geometric condition
guaranteeing quantizability of a function.

Definition 23.22 A possibly complex vector field $X$ preserves a
polarization $P$ if for every vector field $Y$ lying in $P$, the vector field $[X, Y]$ also
lies in $P$.

Note that if $X$ lies in $P$, then $X$ preserves $P$, by the integrability
assumption on $P$. There will typically be, however, many vector fields that do not
lie in $P$ but nevertheless preserve $P$.

If $X$ is a real vector field, then $[X, Y]$ is the same as the Lie derivative
$\mathcal{L}_X(Y)$. It is then not hard to show that $X$ preserves $P$ if and only if the
flow generated by $X$ preserves $P$, that is, if and only if $(\Phi_t)_*(P_z) = P_{\Phi_t(z)}$
for all $z$ and $t$, where $\Phi$ is the flow of $X$. Furthermore, if $X$ is real, then $X$
preserves $P$ if and only if $X$ preserves $\bar{P}$.

Example 23.23 If $N = T^*M$ for some manifold $M$ and $P$ is the vertical
polarization on $N$, then a Hamiltonian vector field $X_f$ preserves $P$ if and
only if $f = f_1 + f_2$, where $f_1$ is constant on each fiber and $f_2$ is linear on
each fiber.

Proof. In local coordinates $\{x_j, p_j\}$, a vector field $X$ lying in $P$ has the
form $X = g_j\, \partial/\partial p_j$. Thus,

$$[X_f, X] = -\left[ \frac{\partial f}{\partial p_j} \frac{\partial}{\partial x_j},\; g_k \frac{\partial}{\partial p_k} \right] + \left[ \frac{\partial f}{\partial x_j} \frac{\partial}{\partial p_j},\; g_k \frac{\partial}{\partial p_k} \right].$$

This commutator will consist of three “good” terms, which involve only
$p$-derivatives, along with the following “bad” term:

$$g_k \frac{\partial^2 f}{\partial p_k \partial p_j} \frac{\partial}{\partial x_j}.$$

If $\partial^2 f/\partial p_k \partial p_j$ is 0 for all $j$ and $k$, then the bad term vanishes and $[X_f, X]$
again lies in $P$. Conversely, if we want the bad term to vanish for each
choice of the coefficient functions $g_j$, we must have $\partial^2 f/\partial p_k \partial p_j = 0$ for all
$j$ and $k$. Thus, for each fixed value of $x$, $f$ must contain only terms that
are independent of $p$ and terms that are linear in $p$. ■
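
For a concrete instance, consider $M = \mathbb{R}$ and the kinetic-energy function $f = p^2/2$; this particular $f$ is chosen here only for illustration. The computation below uses the convention $X_f = \frac{\partial f}{\partial x}\frac{\partial}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial}{\partial x}$ [obtained from $\omega(X_f, \cdot) = df$ with $\omega = dp \wedge dx$] and a vertical vector field $X = g\,\partial/\partial p$.

```latex
\begin{align*}
X_f &= -p\,\frac{\partial}{\partial x}, \\
[X_f, X] &= \left[-p\,\frac{\partial}{\partial x},\; g\,\frac{\partial}{\partial p}\right]
  = -p\,\frac{\partial g}{\partial x}\,\frac{\partial}{\partial p}
    + g\,\frac{\partial}{\partial x}.
\end{align*}
```

The $\partial/\partial x$ term is exactly the bad term $g\,(\partial^2 f/\partial p^2)\,\partial/\partial x$ with $\partial^2 f/\partial p^2 = 1$, so $X_f$ does not preserve $P$ unless $g = 0$. By contrast, for $f = p$ one has $X_f = -\partial/\partial x$ and $[X_f, X] = -(\partial g/\partial x)\,\partial/\partial p$, which again lies in $P$.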

We  now  identify  the  condition  for  quantizability  of  functions. 


Theorem 23.24 For any smooth, complex-valued function $f$ on $N$, if the
Hamiltonian vector field $X_f$ preserves $\bar{P}$, then $f$ is quantizable.

Since we do not assume that $f$ is real-valued, the condition that $X_f$
preserve $\bar{P}$ is not equivalent to the condition that $X_f$ preserve $P$.

Proof. Given a polarized section $s$, we apply $Q_{\mathrm{pre}}(f)$ to $s$ and then test
whether $Q_{\mathrm{pre}}(f)s$ is still polarized, by applying $\nabla_X$ for some vector field
$X$ lying in $\bar{P}$. To this end, it is useful to compute the commutator of $\nabla_X$
and $Q_{\mathrm{pre}}(f)$, as follows:

$$\begin{aligned}
[\nabla_X, Q_{\mathrm{pre}}(f)] &= i\hbar[\nabla_X, \nabla_{X_f}] + [\nabla_X, f] \\
&= i\hbar\left(\nabla_{[X, X_f]} - \frac{i}{\hbar}\omega(X, X_f)\right) + X(f) \\
&= i\hbar\nabla_{[X, X_f]},
\end{aligned} \qquad (23.12)$$

where we have used that

$$\omega(X, X_f) = -\omega(X_f, X) = -df(X) = -X(f),$$

by Definition 21.6. Since $X_f$ preserves $\bar{P}$, the vector field $[X, X_f]$ again lies
in $\bar{P}$ and, thus,

$$\nabla_X(Q_{\mathrm{pre}}(f)s) = Q_{\mathrm{pre}}(f)\nabla_X s + i\hbar\nabla_{[X, X_f]} s = 0$$

for every polarized section $s$, showing that $Q_{\mathrm{pre}}(f)s$ is again polarized. ■

The converse of Theorem 23.24 is false in general. After all, as we will see
in the following subsections, for a given polarization, there may not be any
nonzero globally defined polarized sections, in which case, any function is
quantizable. On the other hand, it can be shown that if $Q_{\mathrm{pre}}(f)$ preserves
the space of locally defined polarized sections, then the Hamiltonian flow
generated by $f$ must preserve $P$. This result follows by the same reasoning
as in the proof of Theorem 23.24, once we know that there are sufficiently
many locally defined polarized sections. We will establish such an existence
result for purely real and purely complex polarizations in the following
subsections; for the general case, see the discussion following Definition
9.1.1 in [45].

A special case of Theorem 23.24 is provided by “polarized functions,”
that is, functions $f$ for which $X(f) = 0$ for all vector fields $X$ lying in
$\bar{P}$. For such an $f$, the action of $Q_{\mathrm{pre}}(f)$ on the quantum space is simply
multiplication by $f$, as we anticipated in the introductory discussion in
Sect. 23.4.

Proposition 23.25 If $f$ is a smooth, complex-valued function on $N$ and
the derivatives of $f$ in the $\bar{P}$ directions are zero, then $Q_{\mathrm{pre}}(f)$ preserves the
space of $P$-polarized sections, and the restriction of $Q_{\mathrm{pre}}(f)$ to this space is
simply multiplication by $f$.

We have already seen special cases of this result in the $\mathbb{R}^{2n}$ case; see the
discussion following Proposition 22.11.

Proof. If the derivatives of $f$ in the direction of $\bar{P}$ are zero, then for $X \in \bar{P}$,
we have

$$0 = X(f) = df(X) = \omega(X_f, X),$$

meaning that $X_f$ is in the $\omega$-orthogonal complement of $\bar{P}$. But since $\bar{P}$
is Lagrangian, this complement is just $\bar{P}$. Thus, $X_f$ belongs to $\bar{P}$ and, in
particular, $X_f$ preserves $\bar{P}$, so that $f$ is quantizable, by Theorem 23.24.
Furthermore, $\nabla_{X_f} s = 0$ for any $P$-polarized section $s$, leaving only the $fs$
term in the formula for $Q_{\mathrm{pre}}(f)s$. ■

23.5.2  The  Real  Case 

In the $\mathbb{R}^{2n}$ case, we have already computed the space of polarized sections
for the vertical polarization in Proposition 22.8. As we observed there, there
are no nonzero polarized sections that are square integrable over $\mathbb{R}^{2n}$. The
same difficulty is easily seen to arise for the vertical polarization on any
cotangent bundle $N = T^*M$. In Sect. 23.6, we will introduce half-forms to
deal with this failure of square integrability.

We now examine properties of general real polarizations. We will see that
polarized sections always exist locally, but not always globally.

Proposition 23.26 If $P$ is a purely real polarization on $N$, then for any
$z_0 \in N$, there exist a neighborhood $U$ of $z_0$ and a $P$-polarized section $s$ of
$L$ defined over $U$ such that $s(z_0) \neq 0$.

Proof. According to the local form of the Frobenius theorem, we can find
a neighborhood $U$ of $z_0$ and a diffeomorphism $\Phi$ of $U$ with a neighborhood
$V$ of the origin in $\mathbb{R}^n \times \mathbb{R}^n$ such that under $\Phi$, the polarization $P$ looks like
the vertical polarization. That is to say, for each $z \in U$, the image of $P_z$
under $\Phi_*(z)$ is just the span of the vectors $\partial/\partial y_1, \ldots, \partial/\partial y_n$, where the $y$'s
are the coordinates on the second copy of $\mathbb{R}^n$. By shrinking $U$ if necessary,
we can assume that $L$ can be trivialized over $U$ and that the open set $V$ is
the product of a ball $B_1$ centered at the origin in the first copy of $\mathbb{R}^n$ with
a ball $B_2$ centered at the origin in the second copy of $\mathbb{R}^n$.

Let $\theta$ be the connection 1-form for an isometric trivialization of $L$ over
$U$ and let $\tilde{\theta} = (\Phi^{-1})^*(\theta)$. Since the subspaces $P_z$ are Lagrangian, the
restriction of $\tilde{\theta}$ to each set of the form $\{x\} \times B_2$ is closed. Since $B_2$
is simply connected, there exists, for each $x \in B_1$, a function $f_x$ on $B_2$
such that the restriction of $\tilde{\theta}$ to $\{x\} \times B_2$ equals $df_x$. If we assume that
$f_x(0) = 0$, then $f_x(y)$ will be smooth as a function of $(x, y)$, since it is
obtained simply by integrating $\tilde{\theta}$ from 0 to $y$ in the vertical directions.

Now, let $\phi$ be any smooth function on $B_1$ with $\phi(0) \neq 0$ and define a
function $\psi$ on $B_1 \times B_2$ by

$$\psi(x, y) = \phi(x) e^{i f_x(y)/\hbar}.$$

For any “vertical” vector field $X$ (i.e., one where $X$ is a linear combination
of $\partial/\partial y_1, \ldots, \partial/\partial y_n$ with smooth coefficients), we compute that

$$X\psi = \frac{i}{\hbar} X(f_x)\,\psi = \frac{i}{\hbar}\tilde{\theta}(X)\,\psi.$$

Thus,

$$\left( X - \frac{i}{\hbar}\tilde{\theta}(X) \right)\psi = 0,$$

from which it follows that the function $\hat{\psi} := \psi \circ \Phi$ represents a polarized
section on $U$ in the given local trivialization of $L$. ■

The existence of nonzero global polarized sections for a purely real
polarization $P$ is a more delicate question. If the leaves of $P$ are not
embedded, there is little chance of finding global polarized sections. Even if the
leaves are embedded, there are obstructions. Since the tangent spaces to
the leaves of $P$ are Lagrangian subspaces, the restriction of $L$ to a leaf $R$ has zero
curvature. There may, nevertheless, be loops in $R$ for which the holonomy
(Definition 23.8) is nontrivial. After all, if a loop $\gamma$ in $R$ is not the
boundary of a surface $S$ in $R$, then we cannot apply (23.6) to conclude that the
holonomy of $\gamma$ is trivial. The collection of holonomies for a leaf $R$ of $P$ can
be understood as a homomorphism of $\pi_1(R)$ into $S^1$. If there is any loop in
$R$ with nontrivial holonomy, any polarized section of $L$ must vanish on $R$.

Definition 23.27 A submanifold $R$ of $N$ is said to be Lagrangian if $\dim
R = n$ and $T_z R$ is a Lagrangian subspace of $T_z N$ for each $z \in R$. A
Lagrangian submanifold $R$ of $N$ is said to be Bohr-Sommerfeld (with
respect to $L$) if the holonomy in $L$ of every loop in $R$ is trivial.

We may summarize the preceding discussion as follows.

Conclusion 23.28 For a purely real polarization $P$ with embedded leaves,
a polarized section vanishes on every leaf of $P$ that is not Bohr-Sommerfeld.

Our next example suggests that when the leaves are compact, the
Bohr-Sommerfeld leaves typically form a discrete set within the set of all leaves.

Example  23.29  Let  TV  =  S'1  x  R,  equipped  with  the  symplectic  form  uj  = 
dxAdf,  where  x  is  the  linear  coordinate  on  R  and  <f  is  the  angular  coordinate 
on  Sl .  Let  L  be  the  trivial  line  bundle  on  TV,  with  sections  that  are  identified 
with  smooth  functions.  Let  0  =  x  df>  and  define  a  connection  V  on  L  by 
Vx  =  TV  —  (i/h)0(X),  and  let  P  be  the  purely  real  polarization  of  N  for 
which  the  leaves  are  the  sets  of  the  form  S 1  x  { x },  for  xGi  Then  a  leaf 
S 1  x  {x}  is  Bohr-Sommerfeld  if  and  only  if  x/h  is  an  integer. 

In  particular,  there  are  no  nonzero,  smooth  polarized  sections  of  L. 

Proof. If we define a section locally on a given leaf $S^1 \times \{x\}$ as

$$ s(\phi) = c\,e^{ix\phi/\hbar} $$

for some nonzero constant $c$, then it is easily verified that $\nabla_{\partial/\partial\phi}\,s = 0$. After
one trip around the circle, the value of this section will be the starting value
times $e^{2\pi i x/\hbar}$. Thus, the holonomy around $S^1 \times \{x\}$ is trivial if and only if
$x/\hbar$ is an integer. A polarized section, then, would have to vanish on all the
leaves where $x/\hbar$ is not an integer. Since the union of such leaves is a dense subset
of $N$, any smooth polarized section must be identically zero. ■
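
The verification left to the reader in this proof can be written out as follows. This is only a sketch using the data of Example 23.29 ($\theta = x\,d\phi$ and $\nabla_X = X - (i/\hbar)\theta(X)$); the symbol $\mathrm{hol}$ for the holonomy is introduced here for convenience and is not notation from the text.

```latex
% Parallel transport along the leaf S^1 x {x}, with x held fixed.
\begin{align*}
\nabla_{\partial/\partial\phi}\, s
  &= \frac{\partial s}{\partial\phi}
     - \frac{i}{\hbar}\,\theta\!\left(\frac{\partial}{\partial\phi}\right) s
   = \frac{\partial s}{\partial\phi} - \frac{i x}{\hbar}\, s, \\
s(\phi) = c\,e^{i x \phi/\hbar}
  &\;\Longrightarrow\;
  \frac{\partial s}{\partial\phi} = \frac{i x}{\hbar}\, s
  \;\Longrightarrow\;
  \nabla_{\partial/\partial\phi}\, s = 0, \\
\mathrm{hol}\bigl(S^1 \times \{x\}\bigr)
  &= \frac{s(\phi + 2\pi)}{s(\phi)} = e^{2\pi i x/\hbar},
  \qquad
  e^{2\pi i x/\hbar} = 1 \iff \frac{x}{\hbar} \in \mathbb{Z}.
\end{align*}
```

The Bohr-Sommerfeld leaves are therefore those with $x = k\hbar$ for an integer $k$, which is a discrete set within the set of all leaves.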

Even in cases such as Example 23.29, where there are no smooth polarized
sections, one may still consider “distributional” polarized sections
that are supported on the Bohr-Sommerfeld leaves, as on pp. 251-252 of
[45].

23.5.3  The  Complex  Case 

In Proposition 22.8, we computed the space of polarized sections for a certain
positive, translation-invariant polarization on $\mathbb{R}^{2n}$, namely the one for
which $P_z$ is spanned by the vectors $\partial/\partial z_j$ in (22.9). The situation here
is better than that for the vertical polarization, in that there are nonzero
polarized sections that are square integrable over $\mathbb{R}^{2n}$. Recall, however,
that if we take our polarization to be spanned by the vectors $\partial/\partial \bar{z}_j$
[see (22.13)], then there are no nonzero square-integrable polarized sections.
This example indicates the importance of the positivity condition in
Definition 23.19.

For our next example, we consider the unit disk $D$,
equipped with the unique (up to a constant) symplectic form that is invariant
under the group of fractional linear transformations that map $D$
onto $D$. In this case, the quantum Hilbert space can be identified with a
weighted Bergman space, that is, an $L^2$ space of holomorphic functions on
$D$ with respect to a measure of the form $(1 - |z|^2)^{\nu}\,dx\,dy$.


Example 23.30 Let $N$ be the unit disk $D \subset \mathbb{R}^2$ equipped with the following
symplectic form:

$$ \omega = 4(1 - |z|^2)^{-2}\,dx \wedge dy = 4(1 - r^2)^{-2}\,r\,dr \wedge d\phi, $$

where $(r, \phi)$ are the usual polar coordinates. Let $L$ be the trivial line bundle
over $D$ with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where $\theta$ is the symplectic
potential for $\omega$ given by

$$ \theta = 2\,\frac{r^2}{1 - r^2}\,d\phi. $$

Define a complex polarization on $D$ by letting $P_z = \mathrm{Span}(\partial/\partial z)$, where
$z = x - iy$. In that case, holomorphic sections $s$ have the form

$$ s(z) = F(z)\,(1 - |z|^2)^{1/\hbar}, $$

where $F$ is holomorphic. The norm of such a section is computed as

$$ \|s\|^2 = 4\int_D |F(z)|^2\,(1 - |z|^2)^{2/\hbar - 2}\,dx\,dy. $$

As in the case of the plane, the seemingly unnatural definition $z = x - iy$
is necessary to obtain a Kähler polarization. If we used $z = x + iy$ instead,
the holomorphic sections would have the form $F(z)(1 - |z|^2)^{-1/\hbar}$, in which
case there would be no nonzero, square-integrable holomorphic sections.
Proof. See Exercise 8. ■

We now consider general purely complex polarizations. Recall that, by
Proposition 23.18 and the Newlander-Nirenberg theorem, $N$ has a unique
complex structure for which $P_z$ is the $(1,0)$-subspace of $T_z^{\mathbb{C}}N$, for all $z \in N$.
As in the purely real case, there always exist local polarized sections.


Theorem 23.31 Suppose $P$ is a purely complex polarization on $N$. Then
for each $z_0 \in N$, there exists a $P$-polarized section $s$ of $L$, defined in a
neighborhood of $z_0$, such that $s(z_0) \neq 0$.

We  defer  the  proof  of  Theorem  23.31  until  the  end  of  this  subsection. 

Suppose $s$ is as in the theorem and $s'$ is any other locally defined
$P$-polarized section. Then $s' = fs$ for some unique complex-valued function $f$,
and by the product rule for covariant derivatives, $X(f) = 0$ for all $X \in \bar{P}_z$.
This means that $f$ is holomorphic with respect to the complex structure
on $N$ for which $P$ is the $(1,0)$-tangent space. Thus, we have a preferred
family of local trivializations of $L$ (the ones given by nonvanishing local
polarized sections) such that the “ratio” of any two such trivializations is
a holomorphic function. This means that we have given $L$ the structure of
a “holomorphic line bundle” over the complex manifold $N$ in such a way
that the holomorphic sections of $L$ are precisely the polarized sections with
respect to $P$.

Arguing as in the proof of Proposition 14.15, it is not hard to show that
for a purely complex polarization, the space of square-integrable polarized
sections of $L$ forms a closed subspace of the prequantum Hilbert space. For
any $z \in N$, if we choose a linear identification of the fiber of $L$ over $z$ with
$\mathbb{C}$, then the map $s \mapsto s(z)$ is a linear functional on the quantum Hilbert
space. It is not hard to show, as in the proof of Proposition 14.15, that
this linear functional is continuous, and can therefore be represented as an
inner product with a unique element of the quantum Hilbert space.

Definition 23.32 Let $P$ be a purely complex polarization on $N$. For each
$z \in N$, choose a linear identification of the fiber of $L$ over $z$ with $\mathbb{C}$. Then
the coherent state $\chi_z$ is the unique element of the quantum Hilbert space
with respect to $P$ such that

$$ s(z) = \langle \chi_z, s \rangle $$

for all $s$.

Suppose $N = \mathbb{R}^2$ with a polarization given by $P_z = \mathrm{Span}(\partial/\partial z)$, where
$z = x - i\alpha p$. If we use the symplectic potential $\theta = (p\,dx - x\,dp)/2$,
then, as in the proof of Proposition 22.14, the quantum Hilbert space is
naturally identifiable with the Segal-Bargmann space. In this case, the
coherent states can be read off from Proposition 14.17.

It could happen that $\chi_z = 0$ for some $z \in N$, or even for all $z \in N$,
depending on the choice of $P$. Even if $\chi_z$ is nonzero, $\chi_z$ is only well defined
up to multiplication by a constant, because we must choose an identification
of the fiber of $L$ over $z$ with $\mathbb{C}$. But if $\chi_z \neq 0$, the one-dimensional subspace spanned
by $\chi_z$ is independent of this choice. That is to say, whenever $\chi_z \neq 0$, the
span of $\chi_z$ is a well-defined element of the projective space $\mathbb{P}(\mathbf{H})$, where
$\mathbf{H}$ is the quantum Hilbert space.

Recall, meanwhile, that if $(L, \nabla)$ is a Hermitian line bundle with connection
having curvature $\omega/\hbar$, then for any positive integer $k$, there is a
natural Hermitian connection on $L^{\otimes k}$ having curvature $k\omega/\hbar$. This means
that if $L$ is a prequantum line bundle with one value $\hbar_0$ of Planck's constant,
then $L^{\otimes k}$ is a prequantum line bundle with Planck's constant equal
to $\hbar_0/k$. The following result shows that in the case of compact symplectic
manifolds with Kähler polarizations, things behave nicely when $k$ tends to
infinity.

Theorem 23.33 Assume $N$ is compact and let $P$ be a Kähler polarization
on $N$. For each positive integer $k$, let $\mathbf{H}_k$ denote the space of polarized
sections of $L^{\otimes k}$. Then for all $k$, $\mathbf{H}_k$ is finite dimensional. Furthermore, for
all sufficiently large $k$, we have the following results. First, the coherent
state $\chi_z \in \mathbf{H}_k$ is nonzero for each $z \in N$. Second, the map

$$ z \mapsto \mathrm{Span}(\chi_z) $$

is an antiholomorphic embedding of $N$ into $\mathbb{P}(\mathbf{H}_k)$.


The finite dimensionality of $\mathbf{H}_k$ is a standard result in the theory of compact,
complex manifolds. The embedding of $N$ into $\mathbb{P}(\mathbf{H}_k)$ is the Kodaira
embedding theorem, which we will not prove here. The Kodaira embedding
theorem implies, in particular, that there exist nonzero, globally defined
polarized sections of $L^{\otimes k}$, at least for large $k$. Since the value of Planck's
constant for $L^{\otimes k}$ is $\hbar_0/k$, Planck's constant tends to zero as $k$ tends to
infinity. Thus, the study of holomorphic sections of $L^{\otimes k}$ for large $k$ can be
understood as being part of semiclassical analysis.
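
The rescaling of Planck's constant under tensor powers can be checked in a local trivialization. The following is a sketch, assuming a local nonvanishing section $s_0$ of $L$ with connection 1-form $\theta/\hbar_0$ (so that $\nabla_X s_0 = -(i/\hbar_0)\theta(X)\,s_0$ and $d\theta = \omega$), and assuming that the connection on the tensor power is the one defined by the Leibniz rule. The names $s_0$ and $\theta$ are local choices made for illustration.

```latex
% Leibniz rule on the k-fold tensor power of the trivializing section s_0.
\begin{align*}
\nabla_X\bigl(s_0^{\otimes k}\bigr)
  &= \sum_{j=1}^{k} s_0 \otimes \cdots \otimes
     \underbrace{(\nabla_X s_0)}_{j\text{th factor}} \otimes \cdots \otimes s_0 \\
  &= -\frac{i k}{\hbar_0}\,\theta(X)\, s_0^{\otimes k}
   = -\frac{i}{\hbar_0/k}\,\theta(X)\, s_0^{\otimes k}.
\end{align*}
% The connection 1-form on L^{\otimes k} is therefore \theta/(\hbar_0/k),
% with curvature k\omega/\hbar_0 = \omega/(\hbar_0/k).
```

In this trivialization, passing from $L$ to $L^{\otimes k}$ has exactly the effect of replacing $\hbar_0$ by $\hbar_0/k$, which is why large $k$ corresponds to the semiclassical regime.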

We now turn to the proof of Theorem 23.31, in which we will make
use of basic properties of complex-valued differential forms on complex
manifolds. (“Complex-valued” means that we allow the value of a $k$-form on
a collection of $k$ tangent vectors to be a complex number.) In a holomorphic
local coordinate system $z_1, \ldots, z_n$, each form can be written as a linear
combination of wedge products of the $dz_j$'s and $d\bar{z}_j$'s. A form is called a
$(p,q)$-form if it is a linear combination of wedge products of $p$ factors
involving the $dz_j$'s and $q$ factors involving the $d\bar{z}_j$'s. Each form can be
decomposed uniquely as a linear combination of $(p,q)$-forms for various values
of $p$ and $q$, and this decomposition does not depend on the choice of
holomorphic coordinate system. If $\alpha$ is a $(p,q)$-form, then $d\alpha$ will be a
linear combination of a $(p+1,q)$-form and a $(p,q+1)$-form. We define operators
$\partial$ and $\bar\partial$ in such a way that $\partial$ maps $(p,q)$-forms to $(p+1,q)$-forms,
$\bar\partial$ maps $(p,q)$-forms to $(p,q+1)$-forms, and $d = \partial + \bar\partial$. In particular,

$$ \partial\bigl(f\,dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge d\bar{z}_{k_1} \wedge \cdots \wedge d\bar{z}_{k_q}\bigr)
   = \sum_{l=1}^{n} \frac{\partial f}{\partial z_l}\,dz_l \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge d\bar{z}_{k_1} \wedge \cdots \wedge d\bar{z}_{k_q}, $$

and similarly for $\bar\partial$ with $(\partial f/\partial z_l)\,dz_l$ replaced by $(\partial f/\partial \bar{z}_l)\,d\bar{z}_l$.
The maps $\partial$ and $\bar\partial$ satisfy the identities:

$$ \partial^2 = \bar\partial^2 = 0, \qquad \partial\bar\partial = -\bar\partial\partial. $$


The Dolbeault lemma states that if a $(p,q)$-form $\alpha$ satisfies $\partial\alpha = 0$, then $\alpha$
can be expressed locally as $\partial\beta$ for some $(p-1,q)$-form $\beta$, and if $\bar\partial\alpha = 0$, then
$\alpha$ can be expressed locally as $\bar\partial\beta$ for some $(p,q-1)$-form $\beta$. A $(p,0)$-form $\alpha$
is said to be holomorphic if it can be expressed in holomorphic coordinates
as a sum of terms of the form

$$ f(z)\,dz_{j_1} \wedge \cdots \wedge dz_{j_p}, $$

where each coefficient function $f$ is holomorphic. A $(p,0)$-form $\alpha$ is holomorphic
if and only if $\bar\partial\alpha = 0$. If a holomorphic $(p,0)$-form $\alpha$ satisfies $\partial\alpha = 0$
(or, equivalently, $d\alpha = 0$), then $\alpha$ can be written locally as $\alpha = \partial\beta$, for
some holomorphic $(p-1,0)$-form $\beta$.

Let $P$ be a purely complex polarization on $N$ and let $J$ be the almost-complex
structure for which $P_z$ is the $(1,0)$-tangent space at $z$. Since
$\omega$ is $J$-invariant (Proposition 23.18), it follows (Exercise 6) that $\omega$ is a
$(1,1)$-form.

Lemma 23.34 Let $N$ be a complex manifold with almost-complex structure
$J$ and let $\omega$ be a closed, $J$-invariant, real-valued $(1,1)$-form on $N$. Then
for every point $z_0 \in N$, there exists a smooth, real-valued function $\kappa$ defined
in a neighborhood of $z_0$ such that $i\partial\bar\partial\kappa = \omega$.


In the case that $N$ is Kähler [i.e., the case where $\omega(X, JX) > 0$], a
function $\kappa$ as in the lemma is called a (local) Kähler potential for $N$.
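Proof. By assumption, $d\omega = (\partial + \bar\partial)\omega = 0$, from which it follows that
$\partial\omega = \bar\partial\omega = 0$, because $\partial\omega$ is a $(2,1)$-form and $\bar\partial\omega$ is a $(1,2)$-form. Thus, by
the Dolbeault lemma, there exists a $(1,0)$-form $\alpha$, defined in a neighborhood
of $z_0$, such that $\bar\partial\alpha = \omega$. Then $\partial\alpha$ is a $(2,0)$-form that satisfies

$$ \bar\partial\partial\alpha = -\partial\bar\partial\alpha = -\partial\omega = 0. $$

This shows that $\partial\alpha$ is actually a holomorphic $(2,0)$-form.

Since also $\partial\partial\alpha = 0$, we see that $\partial\alpha$ is closed, which means that there
exists a holomorphic 1-form $\eta$, defined in a possibly smaller neighborhood
of $z_0$, such that $d\eta = \partial\eta = \partial\alpha$. Thus, $\partial(\alpha - \eta) = 0$, and so by the Dolbeault
lemma, there exists a function $g$, defined in a neighborhood of $z_0$, such that
$\partial g = \alpha - \eta$. Thus, $\alpha = \eta + \partial g$ and so

$$ \omega = \bar\partial\alpha = \bar\partial\partial g = -\partial\bar\partial g $$

since $\bar\partial\eta = 0$. The function $\kappa := ig$ then satisfies $i\partial\bar\partial\kappa = \omega$.

Now, a calculation in coordinates (Exercise 7) shows that the map $\kappa \mapsto
i\partial\bar\partial\kappa$ is real, that is, it maps real-valued functions to real-valued 2-forms.
Since $\omega$ is real, the operator $i\partial\bar\partial$ must map the imaginary part of $\kappa$ to zero.
Thus, $i\partial\bar\partial\kappa$ is unchanged if $\kappa$ is replaced by its real part. ■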
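
As a concrete instance of the lemma, consider $N = \mathbb{C}$ with the standard coordinate $z = x + iy$ and the candidate potential $\kappa = |z|^2$. Both choices are assumptions made only for illustration; in particular, the coordinate convention differs from the convention $z = x - iy$ of Example 23.30. A short computation shows which 2-form this $\kappa$ produces:

```latex
\begin{align*}
\bar\partial\kappa &= \bar\partial(z\bar z) = z\,d\bar z, \\
i\,\partial\bar\partial\kappa
  &= i\,\partial(z\,d\bar z) = i\,dz \wedge d\bar z \\
  &= i\,(dx + i\,dy) \wedge (dx - i\,dy)
   = i\,(-2i)\,dx \wedge dy
   = 2\,dx \wedge dy.
\end{align*}
```

So $\kappa = |z|^2$ is a real-valued function with $i\partial\bar\partial\kappa$ equal to the closed, real $(1,1)$-form $2\,dx \wedge dy$, in line with the conclusion of the lemma.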

Proof of Theorem 23.31. Let $\kappa$ be as in Lemma 23.34 and let $\theta$ be the
real-valued 1-form given by

$$ \theta = \mathrm{Im}(\partial\kappa) = \frac{1}{2i}\bigl(\partial\kappa - \bar\partial\kappa\bigr). $$

Then because $\partial^2 = \bar\partial^2 = 0$, we have

$$ d\theta = (\partial + \bar\partial)\theta = \frac{1}{2i}\bigl(\bar\partial\partial\kappa - \partial\bar\partial\kappa\bigr) = \omega. \tag{23.13} $$

That is to say, $\theta$ is a symplectic potential for $\omega$. Thus, by Proposition 23.6,
we can find a local isometric trivialization $s_0$ of $L$ for which the connection
1-form is $\theta/\hbar$.

For any vector $X$, we have

$$ \nabla_X\bigl(e^{-\kappa/(2\hbar)} s_0\bigr) = \left(-\frac{1}{2\hbar}X(\kappa) - \frac{i}{\hbar}\theta(X)\right) e^{-\kappa/(2\hbar)} s_0, \tag{23.14} $$

where $X(\kappa) = d\kappa(X) = \partial\kappa(X) + \bar\partial\kappa(X)$. Now, if $X$ is of type $(0,1)$, then
$\partial\kappa(X) = 0$, so that $X(\kappa) = \bar\partial\kappa(X)$ and $\theta(X) = (i/2)\,\bar\partial\kappa(X)$. In that case,
if we use (23.13), we find that the two terms on
the right-hand side of (23.14) cancel. Thus, $e^{-\kappa/(2\hbar)} s_0$ is the desired local
polarized section. ■

In this section, we introduce a concept known as half-forms, which are
designed to work around the problem that, in the case of real polarizations,
there often do not exist any nonzero square-integrable polarized sections.

A polarized section $s$ for a real polarization $P$ tends to have infinite
norm, because we may get infinity from integrating $|s|^2$ along the leaves of
the polarization. To illustrate how half-forms work around this problem,
consider the case of the vertical polarization on $\mathbb{R}^2 = T^*\mathbb{R}$. Elements of the
half-form Hilbert space will be representable in the form $s \otimes \sqrt{dx}$, where $s$
is a polarized section of $L$ and where $\sqrt{dx}$ will be interpreted as a “section
of the square root of the canonical bundle.” To compute the norm of such
an object, we first square it at each point to obtain the quantity $|s|^2\,dx$.
Since $s$ is polarized, $|s|^2$ is a function of $x$ only, independent of $p$. Thus,
$|s|^2\,dx$ may be thought of as a 1-form on $\mathbb{R}$, rather than on $\mathbb{R}^2$, which we
may then integrate to obtain

$$ \bigl\| s \otimes \sqrt{dx} \bigr\|^2 = \int_{\mathbb{R}} |s|^2(x)\,dx. $$


This procedure has two advantages over the one we used in Sect. 22.4,
where we simply integrated $|s|^2$ itself over $\mathbb{R}$. First, a version of this
procedure works for real polarizations on general symplectic manifolds. Second,
the half-form approach will allow quantized observables to be self-adjoint,
which was not the case in Sect. 22.5 when we simply restricted prequantized
observables to the polarized subspace. (See the discussion following
Proposition 22.12.)

Throughout this section, we assume that $N$ is a quantizable symplectic
manifold, that $L$ is a fixed prequantum line bundle over $N$, and that $P$ is
a fixed purely real polarization on $N$.

23.6.1 The Space of Leaves

Recall that a leaf of $P$ is a maximal connected, integral submanifold of
$P$. We may then form the leaf space $\Xi$ (the set of all leaves of $P$) and a
quotient map $q : N \to \Xi$ sending each point $z \in N$ to the unique leaf
containing $z$. We may topologize $\Xi$ by defining a set $U$ in $\Xi$ to be open if
$q^{-1}(U)$ is open in $N$.

In order to be able to carry out the program of geometric quantization
with respect to $P$, we must assume that $\Xi$ can be given the structure
of a smooth, $n$-dimensional manifold in such a way that $q : N \to \Xi$ is
smooth and such that the kernel of $q_{*,z}$ is equal to $P_z^{\mathbb{R}}$, the intersection of
$P_z$ with the real tangent space $T_zN$. We abbreviate this assumption on
$\Xi$ by saying that $\Xi$ is a smooth manifold. In the case $N = T^*M$ with the
vertical polarization (Example 23.17), the leaf space $\Xi$ is a smooth manifold
diffeomorphic to $M$.

It should be emphasized that even if $\Xi$ is a smooth manifold, there is no
canonical “volume measure” on $\Xi$. Thus, our half-form Hilbert space will
be defined in such a way that the pointwise “square” of an element will
be an $n$-form, rather than a function, on the leaf space, which can then be
integrated over the $n$-manifold $\Xi$.

23.6.2  The  Canonical  Bundle 

We now introduce the canonical bundle of a purely real polarization $P$,
with sections that are a special sort of $n$-form on $N$, along with a notion
of polarized section of the canonical bundle. If the leaf space $\Xi$ is a smooth
manifold, the space of polarized sections of the canonical bundle can be
identified with the space of all $n$-forms on the $n$-manifold $\Xi$.

Definition 23.35 The canonical bundle $\mathcal{K}_P$ of $P$ is the real line bundle
with sections that are $n$-forms $\alpha$ having the property that

$$ X \lrcorner\, \alpha = 0 \tag{23.15} $$

for every vector field $X$ lying in $P$. A section $\alpha$ of $\mathcal{K}_P$ is polarized if

$$ X \lrcorner\, (d\alpha) = 0 \tag{23.16} $$

for every vector field $X$ lying in $P$.

If an $n$-form $\alpha$ satisfies (23.15), then $\alpha(X_1, \ldots, X_n) = 0$ if any of the
$X_j$'s belongs to $P$. Thus, the value of $\alpha$ at any point $z$ can be viewed as
an $n$-linear, alternating functional on the quotient vector space $T_zN/P_z^{\mathbb{R}}$,
where $P_z^{\mathbb{R}}$ is the intersection of $P_z$ with the real tangent space. Since this
quotient space is $n$-dimensional, we see that at each point, the space of
possible values for $\alpha$ is one dimensional.

Meanwhile, if $\alpha$ satisfies (23.16), then at each point, $d\alpha$ is an $(n+1)$-linear,
alternating functional on $T_zN/P_z^{\mathbb{R}}$, which must be zero because that
quotient space is only $n$-dimensional. Thus, for
sections of $\mathcal{K}_P$, (23.16) is equivalent to the condition

$$ d\alpha = 0. \tag{23.17} $$

We can also introduce the complexified canonical bundle $\mathcal{K}_P^{\mathbb{C}}$, the sections
of which are complex-valued $n$-forms satisfying (23.15). We define a section
of $\mathcal{K}_P^{\mathbb{C}}$ to be polarized if it satisfies (23.16).

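Example 23.36 Let $N = T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$ and let $P$ be the vertical polarization
on $N$. Then an $n$-form $\alpha$ on $\mathbb{R}^{2n}$ is a section of $\mathcal{K}_P$ if and only if $\alpha$
is of the form

$$ \alpha = f(x, p)\,dx_1 \wedge \cdots \wedge dx_n, \tag{23.18} $$

and $\alpha$ is a polarized section of $\mathcal{K}_P$ if and only if $\alpha$ is of the form

$$ \alpha = g(x)\,dx_1 \wedge \cdots \wedge dx_n, \tag{23.19} $$

for smooth functions $f$ on $\mathbb{R}^{2n}$ and $g$ on $\mathbb{R}^n$.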

Proof. If $\alpha$ contained any term involving $dp_j$, the contraction of $\alpha$ with
$\partial/\partial p_j$ would not be zero, leaving (23.18) as the only possible form for a
section of $\mathcal{K}_P$. Assuming $\alpha$ is of the form (23.18), if $f$ is not independent
of $p$, then $d\alpha$ will contain a nonzero term of the form $dp_j \wedge dx_1 \wedge \cdots \wedge dx_n$,
leaving (23.19) as the only possible form for a polarized section of $\mathcal{K}_P$. ■
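
The case $n = 1$ makes the two conditions completely explicit. The following sketch assumes the vertical polarization on $T^*\mathbb{R} = \mathbb{R}^2$, so that $P$ is spanned by $\partial/\partial p$, and writes a general 1-form without a $dp$ term as $f(x,p)\,dx$:

```latex
% Requires amssymb for \lrcorner (interior product).
\begin{align*}
\alpha = f(x,p)\,dx
  &\quad\Longrightarrow\quad
  \frac{\partial}{\partial p} \lrcorner\, \alpha
  = f\,dx\!\left(\frac{\partial}{\partial p}\right) = 0, \\
d\alpha = \frac{\partial f}{\partial p}\,dp \wedge dx
  &\quad\Longrightarrow\quad
  \frac{\partial}{\partial p} \lrcorner\, d\alpha
  = \frac{\partial f}{\partial p}\,dx .
\end{align*}
```

Thus condition (23.15) holds automatically for such $\alpha$, while condition (23.16) holds exactly when $\partial f/\partial p = 0$, that is, when $\alpha = g(x)\,dx$.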

In Example 23.36, the polarized sections of $\mathcal{K}_P$ are effectively just
$n$-forms on the configuration space $\mathbb{R}^n$. This conclusion is a special case of
the following result.

Proposition 23.37 If the leaf space $\Xi$ of $P$ is a smooth manifold and $\alpha$
is a polarized section of $\mathcal{K}_P$, then there exists a unique $n$-form $\tilde\alpha$ on $\Xi$ such
that

$$ \alpha = q^*(\tilde\alpha), $$

where $q : N \to \Xi$ is the quotient map. Conversely, if $\beta$ is any $n$-form on $\Xi$,
then $\alpha := q^*(\beta)$ is a polarized section of $\mathcal{K}_P$.

Proof. Suppose, first, that $\alpha = q^*(\beta)$, for an $n$-form $\beta$ on $\Xi$. Then $X \lrcorner\, \alpha = 0$
whenever $X$ lies in $P$, since $P$ is the kernel of $q_*$. Furthermore, $d\alpha =
q^*(d\beta) = 0$, since $\beta$ is an $n$-form on an $n$-manifold, showing that $\alpha$ is a
polarized section of $\mathcal{K}_P$.

In the other direction, we have already noted in the proof of Proposition
23.26 that $N$ can be identified locally with a neighborhood $U \times V$ of the
origin in $\mathbb{R}^n \times \mathbb{R}^n$ in such a way that leaves of $P$ correspond to the sets of the
form $\{x\} \times V$. We can use $q$ to identify $U \cong U \times \{0\}$ with an open set $\tilde U$
in $\Xi$. Thus, $P$ looks locally just like the vertical polarization on $\mathbb{R}^{2n}$, and
so, by Example 23.36, any polarized section $\alpha$ of $\mathcal{K}_P$ will be of the form
(23.19). Thus, $\alpha$ determines an $n$-form $\tilde\alpha$ on $U$ and $\alpha$ is the pullback of
$\tilde\alpha$ by the projection map of $U \times V$ onto $U$. It follows that $\alpha$ is locally the
pullback by $q$ of an $n$-form $\tilde\alpha$ on $\tilde U$. We leave it to the reader to check that
overlapping neighborhoods in $N$ give the same form $\tilde\alpha$ on $\Xi$ and that the
desired result holds globally. ■

Recall from Theorem 23.24 that $Q_{\mathrm{pre}}(f)$ preserves the space of polarized
sections with respect to $P$, provided that the flow of $X_f$ preserves $\bar{P}$ (which
equals $P$, in this case). We now establish that for any such $f$, the Lie
derivative $\mathcal{L}_{X_f}$ preserves the space of polarized sections of $\mathcal{K}_P$. This result
will eventually allow us to define a quantum operator $Q(f)$ on the half-form
Hilbert space associated to $P$.

Proposition 23.38 Suppose $X$ is a vector field on $N$ that preserves $P$,
in the sense of Definition 23.22, and suppose $\alpha$ is a smooth section of $\mathcal{K}_P$.
Then the Lie derivative $\mathcal{L}_X\alpha$ is another section of $\mathcal{K}_P$ and if $\alpha$ is polarized,
$\mathcal{L}_X\alpha$ is also polarized.


Proof. Suppose $X_1, \ldots, X_n$ are smooth vector fields, with $X_1$ lying in
$P = \bar{P}$. Then, by a standard formula for the Lie derivative,

$$ (\mathcal{L}_X\alpha)(X_1, \ldots, X_n) = X\bigl(\alpha(X_1, \ldots, X_n)\bigr) - \alpha([X, X_1], X_2, \ldots, X_n) - \sum_{j=2}^{n} \alpha(X_1, \ldots, X_{j-1}, [X, X_j], X_{j+1}, \ldots, X_n). \tag{23.20} $$

Now, because $\alpha$ is a section of $\mathcal{K}_P$, the first and third terms on the right-hand
side of (23.20) vanish. Because $X$ preserves $P$, $[X, X_1]$ will again lie
in $P$, and so the second term vanishes as well. Thus, $X_1 \lrcorner\, (\mathcal{L}_X\alpha) = 0$, which
means that $\mathcal{L}_X\alpha$ is again a section of $\mathcal{K}_P$.

Since $\mathcal{L}_X\alpha = X \lrcorner\, d\alpha + d(X \lrcorner\, \alpha)$, if $\alpha$ satisfies (23.17), we have

$$ d(\mathcal{L}_X\alpha) = d^2(X \lrcorner\, \alpha) = 0, $$

showing that $\mathcal{L}_X\alpha$ is again polarized. ■

Proposition 23.39 Suppose the leaf space $\Xi$ of $P$ is a smooth manifold
and that a vector field $X$ on $N$ preserves $P$. Then there exists a unique
vector field $Y$ on $\Xi$ such that

$$ q_{*,z}(X) = Y \tag{23.21} $$

for all $z \in N$. Furthermore, if $\alpha = q^*(\beta)$ is a polarized section of $\mathcal{K}_P$, as
in Proposition 23.37, then

$$ \mathcal{L}_X\bigl(q^*(\beta)\bigr) = q^*\bigl(\mathcal{L}_Y(\beta)\bigr). \tag{23.22} $$

That is to say, under the identification in Proposition 23.37 of polarized
sections of $\mathcal{K}_P$ with $n$-forms on $\Xi$, the operator $\mathcal{L}_X$ corresponds to the Lie
derivative on $\Xi$ in the direction of $Y$.

Proof. By Definition 23.22, $[X, Z]$ lies in $P$ whenever the vector field $Z$
lies in $P$. Thus, if a function $\phi$ is constant along $P$ (i.e., annihilated by
every vector field $Z$ lying in $P$), the same will be true of $X\phi$, since
$Z(X\phi) = X(Z\phi) - [X, Z]\phi = 0$. Thus, if $\phi$ is
of the form $\phi = \psi \circ q$ for some function $\psi$ on $\Xi$, then $X\phi$ is of the form
$\tilde\psi \circ q$ for some other function $\tilde\psi$ on $\Xi$. The map $\psi \mapsto \tilde\psi$ is easily seen to be a
vector field, that is, a derivation of $C^\infty(\Xi)$. We conclude, then, that there
is a unique vector field $Y$ on $\Xi$ such that

$$ X(\psi \circ q) = (Y\psi) \circ q \tag{23.23} $$

for every smooth function $\psi$ on $\Xi$. It then follows from the definition of the
differential that (23.21) holds for all $z \in N$. From (23.21), it follows easily
that for any $n$-form $\beta$ on $\Xi$, we have

$$ X \lrcorner\, \bigl(q^*(\beta)\bigr) = q^*(Y \lrcorner\, \beta). \tag{23.24} $$

Since $\beta$, being a top-degree form, is closed, $q^*(\beta)$ is also closed. Thus, one
of the terms in the formula (21.7) for the Lie derivative of $\beta$ and $q^*(\beta)$ is
zero. Applying $d$ to both sides of (23.24) then gives (23.22). ■

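Given a vector field $Y$ and a nowhere-vanishing $n$-form $\beta$ on $\Xi$, let $\operatorname{div}_\beta Y$
be the unique function on $\Xi$ such that

$$ \mathcal{L}_Y(\beta) = (\operatorname{div}_\beta Y)\,\beta. $$

Then by (23.22), we have

$$ \mathcal{L}_X\bigl(q^*(\beta)\bigr) = \bigl((\operatorname{div}_\beta Y) \circ q\bigr)\,q^*(\beta). \tag{23.25} $$

The expression (23.25) will be helpful in analyzing the quantization of
observables in Sect. 23.6.5.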
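
To see that $\operatorname{div}_\beta Y$ deserves its name, it may help to compute it in the simplest setting. The following sketch assumes, purely for illustration, that $\Xi = \mathbb{R}^n$ with coordinates $x_1, \ldots, x_n$, that $\beta = dx_1 \wedge \cdots \wedge dx_n$, and that $Y = \sum_j Y_j\,\partial/\partial x_j$; it uses the formula $\mathcal{L}_Y\beta = Y \lrcorner\, d\beta + d(Y \lrcorner\, \beta)$ for the Lie derivative.

```latex
% Requires amssymb (\lrcorner) and amsmath (\operatorname).
\begin{align*}
\mathcal{L}_Y\beta
  &= Y \lrcorner\, d\beta + d(Y \lrcorner\, \beta) = d(Y \lrcorner\, \beta)
  && \text{(since } d\beta = 0\text{)}, \\
Y \lrcorner\, \beta
  &= \sum_{j=1}^{n} (-1)^{j-1}\, Y_j\;
     dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n, \\
d(Y \lrcorner\, \beta)
  &= \Bigl(\sum_{j=1}^{n} \frac{\partial Y_j}{\partial x_j}\Bigr)\,\beta
  \quad\Longrightarrow\quad
  \operatorname{div}_\beta Y = \sum_{j=1}^{n} \frac{\partial Y_j}{\partial x_j}.
\end{align*}
```

In this case $\operatorname{div}_\beta Y$ is the ordinary divergence of $Y$; for a different choice of $\beta$ the function changes, which is why the subscript $\beta$ is kept in the notation.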

23.6.3  Square  Roots  of  the  Canonical  Bundle 

We now assume that the leaf space $\Xi$ of $P$ is an orientable manifold, and
we choose one particular orientation of $\Xi$.

Definition 23.40 Choose a nowhere-vanishing, oriented $n$-form $\beta$ on $\Xi$,
so that $\alpha := q^*(\beta)$ is (Proposition 23.37) a nowhere-vanishing section of
$\mathcal{K}_P$. A section of $\mathcal{K}_P$ is non-negative if it is, at each point, a non-negative
multiple of $\alpha$. This notion does not depend on the choice of oriented $n$-form
$\beta$.


Because $\Xi$ is orientable, the canonical bundle $\mathcal{K}_P$ is trivializable: the
section $\alpha$ in Definition 23.40 is a globally trivializing section. Thus, we can
find a square root of $\mathcal{K}_P$, that is, a line bundle $\delta_P$ such that $\delta_P \otimes \delta_P$ is
isomorphic to $\mathcal{K}_P$. (We may, for example, take $\delta_P$ to be the trivial bundle.)
When we speak of a square root of $\mathcal{K}_P$, we will mean, more precisely, a
bundle $\delta_P$ together with a particular isomorphism of $\delta_P \otimes \delta_P$ with $\mathcal{K}_P$.
Thus, if $s_1$ and $s_2$ are sections of $\delta_P$, we think of $s_1 \otimes s_2$ as being a section
of $\mathcal{K}_P$. We assume, further, that the isomorphism of $\delta_P \otimes \delta_P$ with $\mathcal{K}_P$ is
chosen so that for any section $s$ of $\delta_P$, the section $s \otimes s$ of $\mathcal{K}_P$ is
non-negative. (If the initial isomorphism of $\delta_P \otimes \delta_P$ with $\mathcal{K}_P$ does not have this
property, compose it with $-I$ in the fibers of $\mathcal{K}_P$.)

We may consider the complexification of $\delta_P$, that is, the line bundle $\delta_P^{\mathbb{C}}$
whose fiber at each point is the complexification of the fiber of $\delta_P$. There
is then a notion of complex conjugation for sections of $\delta_P^{\mathbb{C}}$, which fixes the
fiber of $\delta_P$ inside the fiber of $\delta_P^{\mathbb{C}}$ at each point. If $s_1$ and $s_2$ are sections of
$\delta_P^{\mathbb{C}}$, we think of $s_1 \otimes s_2$ as a section of the complexified canonical bundle
$\mathcal{K}_P^{\mathbb{C}}$.

If $\alpha$ is a section of $\mathcal{K}_P$ and $X$ is a vector field lying in $P$, let us define an
$n$-form $\nabla_X\alpha$ by

$$ \nabla_X\alpha = X \lrcorner\, (d\alpha). \tag{23.26} $$

Since $\alpha$ is a section of $\mathcal{K}_P$, we have $X \lrcorner\, \alpha = 0$, which means that $\nabla_X\alpha$
actually coincides with $\mathcal{L}_X\alpha$ by (21.7). Since it lies in $P$, the vector field
$X$ preserves $P$, and thus $\nabla_X\alpha = \mathcal{L}_X\alpha$ is again a section of $\mathcal{K}_P$, by
Proposition 23.38. The operator $\nabla$ in (23.26) has all the properties of a connection
on $\mathcal{K}_P$ except that it is only defined in the directions of $P$. [Note that $\mathcal{L}_X$
does not, in general, satisfy the condition $\mathcal{L}_{fX} = f\mathcal{L}_X$, as required by
Definition 23.2. Since, however, $\mathcal{L}_X\alpha$ can also be computed as in (23.26), for
any section $\alpha$ of $\mathcal{K}_P$, the map $\nabla$ does satisfy $\nabla_{fX} = f\nabla_X$.]

We call $\nabla$ the natural partial connection on $\mathcal{K}_P$. According to
Definition 23.35, a section $\alpha$ of $\mathcal{K}_P$ is polarized if and only if $\nabla_X\alpha = 0$ for each
vector field $X$ lying in $P$. We now show that both the partial connection
and the Lie derivative “descend” to sections of $\delta_P$ in a natural way. This
result will, in particular, allow us to define a notion of polarized sections
of $\delta_P$.


Proposition 23.41 Let $\delta_P$ be a fixed square root of $\mathcal{K}_P$. For any vector
field $X$ lying in $P$, there is a unique linear operator $\nabla_X$ mapping sections
of $\delta_P$ to sections of $\delta_P$, such that

$$ \nabla_X(f s_1) = X(f)\,s_1 + f\,\nabla_X s_1 \tag{23.27} $$

$$ \nabla_X(s_1 \otimes s_2) = (\nabla_X s_1) \otimes s_2 + s_1 \otimes (\nabla_X s_2) \tag{23.28} $$

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$. On the left-hand
side of (23.28), $\nabla_X$ is the partial connection on $\mathcal{K}_P$ given by (23.26).

If $X$ is a vector field on $N$ that preserves $P$, then there is a unique linear
operator $\mathcal{L}_X$, mapping sections of $\delta_P$ to sections of $\delta_P$, such that

$$ \mathcal{L}_X(f s_1) = X(f)\,s_1 + f\,\mathcal{L}_X s_1 $$

$$ \mathcal{L}_X(s_1 \otimes s_2) = (\mathcal{L}_X s_1) \otimes s_2 + s_1 \otimes (\mathcal{L}_X s_2) $$

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$.

Both of these constructions extend naturally from sections of $\delta_P$ to
sections of $\delta_P^{\mathbb{C}}$.

We may then say that a section $s$ of $\delta_P$ is polarized if $\nabla_X s = 0$ for every
smooth vector field $X$ lying in $P$.

Proof. If $V$ is a one-dimensional vector space, then the map $\otimes : V \times V \to
V \otimes V$ is commutative: $u \otimes v = v \otimes u$ for all $u, v \in V$. Furthermore, if $u_0$ is a
nonzero element of $V$, then the map $u \mapsto u \otimes u_0$ is an invertible linear map
of $V$ to $V \otimes V$. Suppose $s_0$ is a local nonvanishing section of $\delta_P$. Applying
(23.28) with $s_1 = s_2 = s_0$, we want

$$ 2(\nabla_X s_0) \otimes s_0 = \nabla_X(s_0 \otimes s_0). \tag{23.29} $$

Since the operation of tensoring with $s_0$ is invertible, there is a unique
section “$\nabla_X s_0$” of $\delta_P$ for which (23.29) holds.

Locally, any section $s$ of $\delta_P$ can be written as $s = g s_0$ for a unique
function $g$. We then define $\nabla_X s$ by

$$ \nabla_X s = X(g)\,s_0 + g\,\nabla_X s_0, \tag{23.30} $$

in which case, (23.27) is easily seen to hold. If $s_1 = g_1 s_0$ and $s_2 = g_2 s_0$,
then using (23.29) and the symmetry of the tensor product, it is easy to
verify that (23.28) holds, with both sides of the equation equal to

$$ X(g_1 g_2)\,s_0 \otimes s_0 + g_1 g_2\,\nabla_X(s_0 \otimes s_0). $$

Uniqueness of $\nabla_X$ holds because both (23.29) and (23.30) are required
by the definition of $\nabla_X$. The action of $\nabla_X$ extends to sections of $\delta_P^{\mathbb{C}}$, by
writing such sections as complex-valued functions times $s_0$. The analysis of
the Lie derivative is similar and is omitted. ■
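
To see what these operators look like concretely, suppose (as an illustrative assumption) that $s_0$ is a local section of $\delta_P$ with $s_0 \otimes s_0 = q^*(\beta)$ for a nowhere-vanishing, oriented $n$-form $\beta$ on $\Xi$. For a vector field $X$ lying in $P$, relation (23.29) together with the fact that $q^*(\beta)$ is polarized determines $\nabla_X s_0$. For a vector field $X$ preserving $P$, with associated vector field $Y$ on $\Xi$ as in Proposition 23.39, the analog of (23.29) for the Lie derivative together with (23.25) determines $\mathcal{L}_X s_0$:

```latex
% Requires amsmath (\operatorname).
\begin{align*}
2\,(\nabla_X s_0) \otimes s_0
  &= \nabla_X\bigl(q^*(\beta)\bigr) = 0
  &&\Longrightarrow\quad \nabla_X s_0 = 0, \\
2\,(\mathcal{L}_X s_0) \otimes s_0
  &= \mathcal{L}_X\bigl(q^*(\beta)\bigr)
   = \bigl((\operatorname{div}_\beta Y) \circ q\bigr)\, s_0 \otimes s_0
  &&\Longrightarrow\quad
  \mathcal{L}_X s_0 = \tfrac{1}{2}\bigl((\operatorname{div}_\beta Y) \circ q\bigr)\, s_0 .
\end{align*}
```

The factor of $1/2$ in the second line is the characteristic feature of a half-form: $s_0$ transforms like a square root of the $n$-form $q^*(\beta)$.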

23.6.4  The  Half-Form  Hilbert  Space 

We continue to assume that the leaf space $\Xi$ of $P$ is an orientable manifold,
and that we have chosen an orientation on $\Xi$. We assume that we have
chosen a square root $\delta_P$ of $\mathcal{K}_P$, as in Sect. 23.6.3. If $L$ is a prequantum line
bundle over $N$, we now form the tensor product bundle $L \otimes \delta_P^{\mathbb{C}}$. Given two
sections $s_1$ and $s_2$ of $L \otimes \delta_P^{\mathbb{C}}$, we decompose them locally as $s_j = \mu_j \otimes \nu_j$,
where $\mu_j$ is a section of $L$ and $\nu_j$ is a section of $\delta_P^{\mathbb{C}}$, and where, say, the
$\mu_j$'s are taken to be nonvanishing. Then we can combine these sections to
form the quantity

$$ (s_1, s_2) := (\mu_1, \mu_2)\,\overline{\nu_1} \otimes \nu_2, \tag{23.31} $$

where $(\mu_1, \mu_2)$ is the pointwise inner product given by the Hermitian structure
on $L$. Since $(\mu_1, \mu_2)$ is a scalar-valued function and $\overline{\nu_1} \otimes \nu_2$ is a section
of $\mathcal{K}_P^{\mathbb{C}}$, the quantity $(s_1, s_2)$ is a section of $\mathcal{K}_P^{\mathbb{C}}$. Any other decomposition
of $s_j$ as the tensor product of a nonvanishing section of $L$ and a section
of $\delta_P^{\mathbb{C}}$ is of the form $(f\mu_j) \otimes (\nu_j/f)$ for some nonvanishing function $f$, and
the value of $(s_1, s_2)$ is the same as for the original decomposition. Since
it is independent of the choice of local decomposition, $(s_1, s_2)$ is actually
defined globally.

Given the connection on $L$ and the partial connection on $\delta_P$
(Proposition 23.41), we can form a partial connection on $L \otimes \delta_P$ with
the following property. For any vector field $X$ lying in $P$, and any section
$s$ of $L \otimes \delta_P$, if we decompose $s$ locally as $s = \mu \otimes \nu$,
where $\mu$ is a nonvanishing section of $L$ and $\nu$ is a section of $\delta_P$,
then

$$\nabla_X(s) = (\nabla_X \mu) \otimes \nu + \mu \otimes (\nabla_X \nu). \tag{23.32}$$

The reader may verify that if $\mu \otimes \nu$ is replaced by
$(f\mu) \otimes (\nu / f)$ for some nonvanishing function $f$, the value of
$\nabla_X(s)$ is unchanged. Thus, as with the quantity $(s_1, s_2)$ in (23.31),
$\nabla_X(s)$ is defined globally. We then define a section $s$ of
$L \otimes \delta_P$ to be polarized if $\nabla_X s = 0$ for each vector field $X$
lying in $P$. If $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P$,
then the section $(s_1, s_2)$ in (23.31) is easily seen to be a polarized section
of $\mathcal{K}_P$.

As in the case without half-forms, there is an obstruction to the existence
of globally defined polarized sections of $L \otimes \delta_P$. We say that a leaf
$R$ is Bohr-Sommerfeld (in the half-form sense, with respect to a particular
choice of $\delta_P$) if there exists a nonzero section $s$ of $L \otimes \delta_P$
defined over $R$ such that $\nabla_X s = 0$ for each tangent vector $X$ to $R$.
As in the case without half-forms, if the leaves are topologically nontrivial,
the Bohr-Sommerfeld leaves will in general be a discrete set in the space of
all leaves.

The Bohr-Sommerfeld leaves in the half-form sense need not be the same
as the Bohr-Sommerfeld leaves in the sense of Definition 23.27. In the
setting of Example 23.29, for instance, the canonical bundle $\mathcal{K}_P$ is
trivial, but the square-root bundle $\delta_P$ may be chosen to be nontrivial, by
putting in a twist by 180 degrees over each copy of $S^1$. (That is to say, we
think of $S^1$ as the interval $[0, 2\pi]$ with the ends identified, and we attach
a copy of $\mathbb{R}$ to each point. But when identifying the fiber at $2\pi$ with
the fiber at $0$, we use the negative of the identity map.) As Exercise 9 shows,
in this example, the Bohr-Sommerfeld leaves are the sets of the form
$\{x\} \times S^1$, where $x/\hbar = n + 1/2$ for some integer $n$.
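The half-integer shift can be checked with a short holonomy computation. The sketch below assumes (this is not restated in the present section) that Example 23.29 is the cylinder $\mathbb{R} \times S^1$ with symplectic potential $\theta = x\,d\phi$ and leaves $\{x\} \times S^1$, so that a section of $L$ that is covariantly constant along a leaf is a multiple of $e^{ix\phi/\hbar}$. The function names are illustrative only.

```python
import cmath
import math


def holonomy(x, hbar):
    """Factor picked up by exp(i*x*phi/hbar) as phi runs from 0 to 2*pi.

    Assumes the connection X - (i/hbar) * theta(X) with theta = x dphi.
    """
    return cmath.exp(2j * math.pi * x / hbar)


def is_bohr_sommerfeld(x, hbar, twisted, tol=1e-9):
    """Test whether the leaf {x} x S^1 carries a nonzero constant section.

    With the twisted square root, the fibers at 2*pi and 0 are glued by -1,
    so the L-factor of the section must change sign around the circle.
    Without the twist it must return to itself.
    """
    required = -1.0 if twisted else 1.0
    return abs(holonomy(x, hbar) - required) < tol
```

Under these assumptions, `is_bohr_sommerfeld(1.5 * hbar, hbar, twisted=True)` holds while `is_bohr_sommerfeld(1.0 * hbar, hbar, twisted=True)` does not, matching the condition $x/\hbar = n + 1/2$.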

Definition 23.42 For any purely real polarization $P$ and any square root
$\delta_P$ of $\mathcal{K}_P$, the half-form space is the space of smooth, polarized
sections of $L \otimes \delta_P$. For a polarized section $s$ of
$L \otimes \delta_P$, define the norm of $s$ by

$$\|s\|^2 = \int_\Xi \widetilde{(s, s)}, \tag{23.33}$$

where $(s, s)$ is as in (23.31) and where $\widetilde{(s, s)}$ is the $n$-form on
$\Xi$ given by Proposition 23.37. If $s_1$ and $s_2$ are elements of the
half-form space with $\|s_1\| < \infty$ and $\|s_2\| < \infty$, define the inner
product of $s_1$ and $s_2$ by

$$\langle s_1, s_2 \rangle = \int_\Xi \widetilde{(s_1, s_2)}.$$

The half-form Hilbert space is the completion with respect to the norm
(23.33) of the space of polarized sections $s$ for which $\|s\| < \infty$.


The integral of $n$-forms on $\Xi$ is taken with respect to the chosen
orientation on $\Xi$. We can always decompose $s$ locally as $s = \mu \otimes \nu$
with $\nu$ being a section of $\delta_P$ (as opposed to $\delta_P^{\mathbb{C}}$) and
$\mu$ being a section of $L$. Then

$$(s, s) = (\mu, \mu)\, \nu \otimes \nu,$$

from which we see that $(s, s)$ is a non-negative section of $\mathcal{K}_P$
(Definition 23.40). (Recall that we have chosen the identification of
$\delta_P \otimes \delta_P$ with $\mathcal{K}_P$ in a particular way, so that
$\nu \otimes \nu$ is always the pullback by $q$ of an oriented form on $\Xi$.)
Thus, the integral on the right-hand side of (23.33) is non-negative, but
possibly infinite.

Example 23.43 Let $N = T^*\mathbb{R} \cong \mathbb{R}^2$ and let $L$ be the trivial
bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where
$\theta = p\,dx$. Let $P$ be the vertical polarization on $N$ and orient
$\mathbb{R}$ so that oriented 1-forms are positive multiples of $dx$. Let
$\delta_P$ be the trivial bundle, with a trivializing section "$\sqrt{dx}$" of
$\delta_P$ such that $\sqrt{dx} \otimes \sqrt{dx} = dx$. Then every polarized
section $s$ of $L \otimes \delta_P$ has the form

$$s = \psi(x) \otimes \sqrt{dx} \tag{23.34}$$

for some function $\psi$ on $\mathbb{R}$. The norm of such a section is computed as

$$\|s\|^2 = \int_{\mathbb{R}} |\psi(x)|^2\, dx.$$


Proof. The sections of $\mathcal{K}_P$ are 1-forms that are zero on
$\partial/\partial p$, that is, 1-forms of the form $\alpha = f(x, p)\,dx$. Such a
1-form satisfies $d\alpha = 0$ if and only if $f$ is independent of $p$. Thus,
$dx$ is a globally defined polarized section of $\mathcal{K}_P$. If we choose
$\delta_P$ to be trivial and let $\sqrt{dx}$ be such that
$\sqrt{dx} \otimes \sqrt{dx} = dx$, then $\sqrt{dx}$ will be a polarized section of
$\delta_P$. Every section $s$ of $L \otimes \delta_P$ can be written uniquely as
$s = \psi(x, p) \otimes \sqrt{dx}$ for some function $\psi$. Since $\sqrt{dx}$ is
polarized and $\theta(\partial/\partial p) = 0$, we see that $s$ is polarized if
and only if $\psi$ is independent of $p$. For a section of the form (23.34), we
have $(s, s) = |\psi(x)|^2\, dx$, in which case, $\widetilde{(s, s)}$ is given by
the same formula as $(s, s)$, but now interpreted as a 1-form on
$\Xi = \mathbb{R}$ rather than $\mathbb{R}^2$. ■
23.6.5 Quantization of Observables

Suppose $f$ is a function on $N$ for which $X_f$ preserves $P$ in the sense of
Definition 23.22. We will now associate with $f$ a self-adjoint (or, at least,
symmetric) operator $Q(f)$ on the half-form Hilbert space of $P$. Operators
of this sort will satisfy exactly the desired commutation relations.

Definition 23.44 For any function $f$ on $N$ for which $X_f$ preserves $P$, let
$Q(f)$ be the operator on the half-form space of $P$ given by

$$Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu + i\hbar\, \mu \otimes \mathcal{L}_{X_f}\nu,$$

where $s$ is decomposed locally as $s = \mu \otimes \nu$, with $\mu$ being a section
of $L$ and $\nu$ a section of $\delta_P$.

The operator $Q(f)$ is well defined (i.e., independent of the choice of local
trivialization) as may easily be verified. This independence holds, however,
only because the coefficient $i\hbar$ of $\nabla_{X_f}$ in the first term exactly
matches the coefficient $i\hbar$ of $\mathcal{L}_{X_f}$ in the second term.

Before describing the general properties of the operators $Q(f)$, we
consider a simple example that illustrates the essential role of the Lie
derivative term in Definition 23.44.


Example 23.45 Let the notation be as in Example 23.43 and let
$f : \mathbb{R}^2 \to \mathbb{R}$ be of the form

$$f(x, p) = a(x) + b(x)p,$$

for some smooth functions $a$ and $b$ on $\mathbb{R}$. Then $X_f$ preserves $P$ and

$$Q(f)\bigl(\psi(x) \otimes \sqrt{dx}\bigr) = \hat\psi(x) \otimes \sqrt{dx},$$

where

$$\hat\psi(x) = -i\hbar\left(b(x)\psi'(x) + \frac{1}{2} b'(x)\psi(x)\right) + a(x)\psi(x).$$

In particular, if $f(x, p) = x$, then $\hat\psi(x) = x\psi(x)$ and if
$f(x, p) = p$, then $\hat\psi(x) = -i\hbar\, d\psi/dx$. More generally, if $a$ and
$b$ are polynomials, then the action of $Q(f)$ on $\psi$ coincides with the Weyl
quantization of $f$ (Exercise 8 in Chap. 13).

The term involving $b'(x)$ comes from the presence of half-forms and is
absent in the formula (22.15) for $Q_{\mathrm{pre}}(f)$. The $b'$ term, with the
exact coefficient of $1/2$, is necessary for $Q(f)$ to be self-adjoint (or, at
least, symmetric); see Exercise 10. Example 23.45 is actually quite
representative of the general case. [Compare (23.38) in the proof of
Theorem 23.47 and Example 23.48.]

Proof. We have computed $Q_{\mathrm{pre}}(f)$ in (22.15) in the proof of
Proposition 22.12. We compute that $X_f$ is equal to $-b(x)\,\partial/\partial x$
plus a term involving $\partial/\partial p$. Since the 1-form $dx$ is closed, we
obtain, by (21.7),

$$\mathcal{L}_{X_f}(dx) = d(X_f \lrcorner\, dx) = -db(x) = -b'(x)\, dx.$$

Using Proposition 23.41, we then obtain

$$\mathcal{L}_{X_f}\bigl(\sqrt{dx}\bigr) \otimes \sqrt{dx} = -\frac{1}{2} b'(x)\, dx = -\frac{1}{2} b'(x)\, \sqrt{dx} \otimes \sqrt{dx}, \tag{23.35}$$

which gives

$$\mathcal{L}_{X_f}\sqrt{dx} = -\frac{1}{2} b'(x)\, \sqrt{dx}.$$

Adding the $\mathcal{L}_{X_f}$ term to the previously computed expression for
$Q_{\mathrm{pre}}(f)$ gives the desired result. ■
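The role of the coefficient $1/2$ can be tested numerically. The following is a minimal sketch, not part of the text's argument: it picks illustrative choices $a(x) = x^2$, $b(x) = 1 + \tfrac12 \sin x$ and two rapidly decaying test functions (all assumptions made only for this example), applies the operator $\psi \mapsto -i\hbar(b\psi' + c\,b'\psi) + a\psi$ for a chosen coefficient $c$, and measures $|\langle \varphi, Q\psi \rangle - \langle Q\varphi, \psi \rangle|$ with a trapezoid rule.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1, chosen only for the illustration


def a(x):
    return x * x


def b(x):
    return 1.0 + 0.5 * math.sin(x)


def db(x):
    return 0.5 * math.cos(x)


def phi(x):
    return cmath.exp(-x * x + 1j * x)


def dphi(x):
    return (-2.0 * x + 1j) * phi(x)


def psi(x):
    return (1.0 + x) * math.exp(-0.5 * x * x)


def dpsi(x):
    return (1.0 - x * (1.0 + x)) * math.exp(-0.5 * x * x)


def q_apply(u, du, c):
    """Return the function x -> -i*hbar*(b u' + c b' u) + a u."""
    return lambda x: -1j * HBAR * (b(x) * du(x) + c * db(x) * u(x)) + a(x) * u(x)


def inner(u, v, lo=-10.0, hi=10.0, n=4000):
    """Trapezoid approximation of the L^2 inner product (conjugate-linear in u)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * (u(x).conjugate() * v(x))
    return total * h


def symmetry_defect(c):
    """|<phi, Q psi> - <Q phi, psi>| for the operator with coefficient c on b'."""
    left = inner(phi, q_apply(psi, dpsi, c))
    right = inner(q_apply(phi, dphi, c), psi)
    return abs(left - right)
```

Integrating by parts shows the defect equals $\hbar\,|2c - 1|\,\bigl|\int b'\,\overline{\varphi}\,\psi\,dx\bigr|$, so `symmetry_defect(0.5)` vanishes up to rounding while `symmetry_defect(0.0)` (the prequantum coefficient) does not.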

Returning now to the setting of general real polarizations, we establish
two key results for the quantized observables $Q(f)$: that they satisfy the
desired commutation relations and that they are self-adjoint (or, at least,
symmetric) whenever $f$ is real valued. It can also be shown that when $f$ is
a polarized function (i.e., constant along each leaf of $P$), then $Q(f)$ acts on
the quantum Hilbert space simply as multiplication by $f$. See Exercise 11.


Theorem 23.46 Suppose $f$ and $g$ are functions on $N$ for which $X_f$ and
$X_g$ preserve $P$. Then the operators $Q(f)$ and $Q(g)$ satisfy

$$\frac{1}{i\hbar}[Q(f), Q(g)] = Q(\{f, g\})$$

on the space of smooth, polarized sections of $L \otimes \delta_P$.

Proof. Since $Q(h)$ is a local operator for any function $h$, it suffices to prove
the result locally. Let us choose, then, a local nonvanishing section $\nu_0$ of
$\delta_P$ so that, locally, each section $s$ of $L \otimes \delta_P$ can be
decomposed uniquely as $s = \mu \otimes \nu_0$. For any vector field $X$ preserving
$P$, we let $\gamma(X)$ be the function such that

$$\mathcal{L}_X \nu_0 = \gamma(X)\nu_0.$$

We then have $Q(f)(\mu \otimes \nu_0) = \hat\mu \otimes \nu_0$, where

$$\hat\mu = [Q_{\mathrm{pre}}(f) + i\hbar\gamma(X_f)]\mu.$$

We now compute that

$$[Q_{\mathrm{pre}}(f) + i\hbar\gamma(X_f),\ Q_{\mathrm{pre}}(g) + i\hbar\gamma(X_g)]$$
$$= [Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] + i\hbar[Q_{\mathrm{pre}}(f), \gamma(X_g)] + i\hbar[\gamma(X_f), Q_{\mathrm{pre}}(g)]$$
$$= i\hbar\, Q_{\mathrm{pre}}(\{f, g\}) + (i\hbar)^2 \bigl(X_f(\gamma(X_g)) - X_g(\gamma(X_f))\bigr).$$

The desired result will follow if we can verify that

$$X_f(\gamma(X_g)) - X_g(\gamma(X_f)) = \gamma(X_{\{f, g\}}). \tag{23.36}$$

To verify (23.36), we use a standard identity for the Lie derivative on
forms: $\mathcal{L}_{[X, Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$. Using Proposition 23.41,
we can easily show that this identity holds also on sections of $\delta_P$, for
vector fields that preserve $P$. It is then a simple calculation (Exercise 12)
to verify (23.36). ■
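In the setting of Example 23.45 the commutation relation can be checked exactly on polynomials. The sketch below is an illustration under stated assumptions, not the proof above: it takes $f = a_1(x) + b_1(x)p$ and $g = a_2(x) + b_2(x)p$ with polynomial coefficients, uses the formula $Q(f)\psi = -i\hbar(b\psi' + \tfrac12 b'\psi) + a\psi$ from Example 23.45, and uses the bracket $\{f, g\} = \partial_x f\,\partial_p g - \partial_p f\,\partial_x g$, which is the sign convention consistent with $X_f = -b\,\partial/\partial x + \cdots$ in that example. Polynomials are stored as coefficient lists (index = power); all helper names are hypothetical.

```python
HBAR = 0.5  # illustrative value


def padd(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]


def pscale(c, p):
    return [c * t for t in p]


def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, s in enumerate(p):
        for j, t in enumerate(q):
            out[i + j] += s * t
    return out


def pderiv(p):
    return [k * p[k] for k in range(1, len(p))] or [0]


def quantize(a, b):
    """Q(f) for f = a(x) + b(x) p, acting on polynomial psi."""
    def op(psi):
        first_order = padd(pmul(b, pderiv(psi)),
                           pscale(0.5, pmul(pderiv(b), psi)))
        return padd(pscale(-1j * HBAR, first_order), pmul(a, psi))
    return op


def bracket(a1, b1, a2, b2):
    """Coefficients (a, b) of {f, g} = a(x) + b(x) p."""
    a = padd(pmul(pderiv(a1), b2), pscale(-1, pmul(b1, pderiv(a2))))
    b = padd(pmul(pderiv(b1), b2), pscale(-1, pmul(b1, pderiv(b2))))
    return a, b


def commutator_defect(a1, b1, a2, b2, psi):
    """Largest coefficient of [Q(f),Q(g)]psi/(i hbar) - Q({f,g})psi."""
    qf, qg = quantize(a1, b1), quantize(a2, b2)
    comm = padd(qf(qg(psi)), pscale(-1, qg(qf(psi))))
    lhs = pscale(1 / (1j * HBAR), comm)
    rhs = quantize(*bracket(a1, b1, a2, b2))(psi)
    return max(abs(c) for c in padd(lhs, pscale(-1, rhs)))
```

For instance, `commutator_defect([0, 1], [0], [0], [1], [1, 2, 3])` tests $f = x$, $g = p$, where $\{x, p\} = 1$ and the commutator of the quantized operators is $i\hbar$ times the identity.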
Theorem 23.47 If $f \in C^\infty(N)$ is real valued and $X_f$ preserves $P$, then
the operator $Q(f)$ is symmetric on the space of smooth sections $s$ in the
half-form space for which $\widetilde{(s, s)}$ has compact support on $\Xi$.

Proof. Suppose $\alpha = q^*(\beta)$ is a polarized section of $\mathcal{K}_P$, so
that there is, at least locally, a corresponding polarized section
$\sqrt{q^*(\beta)}$ of $\delta_P$. If $X_f$ preserves $P$, then by
Proposition 23.39, there is a unique vector field $Y_f$ on $\Xi$ such that
$q_{*,z}(X_f) = Y_f$ for all $z \in N$. Using (23.25) and Proposition 23.41,
we get

$$\mathcal{L}_{X_f}\Bigl(\sqrt{q^*(\beta)}\Bigr) = \frac{1}{2}\bigl((\operatorname{div}_\beta Y_f) \circ q\bigr)\sqrt{q^*(\beta)}.$$

Meanwhile, it is not hard to show (Exercise 13) that it is possible to
choose a local symplectic potential $\theta$ that is zero in the directions of
$P$. Thus, we can trivialize $L$ locally in such a way that sections that are
covariantly constant along $P$ are simply functions that are constant along
$P$ in the ordinary sense. Thus, elements $s$ of the half-form space have,
locally, the form

$$s = (\psi \circ q) \otimes \sqrt{q^*(\beta)} \tag{23.37}$$

for some function $\psi$ and $n$-form $\beta$ on $\Xi$. Thus, if $X_f$ preserves
$P$, and a section $s$ is decomposed locally as in (23.37), we have

$$Q(f)(s) = (\hat\psi \circ q) \otimes \sqrt{q^*(\beta)},$$

where

$$\hat\psi = i\hbar\left[Y_f\psi + \frac{1}{2}(\operatorname{div}_\beta Y_f)\psi\right] + (\theta(X_f) + f)\psi. \tag{23.38}$$

(The zeroth-order term arises because $i\hbar\nabla_{X_f} = i\hbar X_f + \theta(X_f)$
when $\nabla_X = X - (i/\hbar)\theta(X)$.) It can be verified (Exercise 14) that the
function $\theta(X_f) + f$ is constant along $P$ and thus may be thought of as a
function on $\Xi$.

By multiplying elements of the half-form space by functions of the form
$\chi \circ q$, with $\chi$ having compact support in $\Xi$, we can "localize" the
calculations on $\Xi$. Suppose $s_1$ and $s_2$ are two elements of the half-form
space decomposed as in (23.37) near a point $z \in N$, with the same $\beta$ and
two different functions $\psi_1$ and $\psi_2$ on $\Xi$. Then
$\widetilde{(s_1, s_2)}$ has the form $\overline{\psi_1}\psi_2\,\beta$ in a
neighborhood $U$ of $q(z)$. By localization, we may assume that
$\widetilde{(s_1, s_2)}$ has compact support in $U$, and we then have

$$\langle s_1, Q(f)s_2 \rangle = \int_U \overline{\psi_1}\, \hat\psi_2\, \beta,$$

where $\hat\psi_2$ is as in (23.38). "Integration by parts" (Exercise 15) with
respect to $\beta$ then shows that this quantity coincides with
$\langle Q(f)s_1, s_2 \rangle$. ■

Example 23.48 (Cotangent Bundles) Let $N = T^*M$ for an oriented
manifold $M$, let $\theta$ be the canonical 1-form on $N$, and let $L$ be the
trivial line bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$. Let
$P$ be the vertical polarization on $N$, so that $\mathcal{K}_P$ is trivial, and let
$\delta_P$ be chosen to be trivial. Let $\beta$ be an arbitrary
nowhere-vanishing, oriented $n$-form on $M$, so that $\alpha := \pi^*(\beta)$ is a
nowhere-vanishing section of $\mathcal{K}_P$, and choose a trivializing section
$\sqrt{\alpha}$ of $\delta_P$ with $\sqrt{\alpha} \otimes \sqrt{\alpha} = \alpha$. In
that case, elements $s$ of the half-form Hilbert space have the form
$s = (\psi \circ \pi) \otimes \sqrt{\alpha}$, where $\psi$ is a function on $M$, and

$$\|s\|^2 = \int_M |\psi|^2\, \beta.$$

The half-form Hilbert space may, thus, be identified with $L^2(M, \beta)$.

Suppose now that $f$ is a function on $T^*M$ of the form $f = f_1 + f_2$, where
$f_1$ is constant on each fiber of $T^*M$ and $f_2$ is linear on each fiber. Then
$f_2$ may be thought of as a section of $T^{**}M = TM$, that is, as a vector
field $Y_f$ on $M$. In that case, $X_f$ preserves $P$ and $Q(f)$ acts on elements
of the half-form space as

$$Q(f)\bigl((\psi \circ \pi) \otimes \sqrt{\alpha}\bigr) = (\hat\psi \circ \pi) \otimes \sqrt{\alpha},$$

where

$$\hat\psi = i\hbar\left[Y_f\psi + \frac{1}{2}(\operatorname{div}_\beta Y_f)\psi\right] + f_1\psi.$$

Here $\operatorname{div}_\beta Y_f$ is the unique function such that
$\mathcal{L}_{Y_f}\beta = (\operatorname{div}_\beta Y_f)\beta$.


A simple calculation in coordinates shows that the vector field $Y_f$ in the
example satisfies $X_f(\psi \circ \pi) = (Y_f\psi) \circ \pi$, so that our notation
is consistent with that in Proposition 23.39 [see (23.23)].

Proof. The calculation is precisely the same as in the proof of Theorem
23.47, except that the decomposition in (23.37) is now global. The claimed
form of $Q(f)$ is nothing but the expression (23.38), where the reader may
easily compute, using local coordinates, that $\theta(X_f) + f = f_1$. ■

It  is  an  unfortunate  feature  of  geometric  quantization  that  in  the  case 
of  the  vertical  polarization  on  cotangent  bundles,  it  only  permits  us  to 
quantize  functions  that  are  at  most  linear  in  the  momentum  variables.  In 
a  typical  physical  system  having  T*M  as  its  phase  space,  there  will  be  a 
“kinetic  energy”  term  in  the  classical  Hamiltonian  that  is  quadratic  in  p. 
To  quantize  such  a  system,  one  has  to  find  a  way  to  quantize  the  kinetic 
energy  term,  “by  hook  or  by  crook.” 

One approach to this problem is to allow the exponentiated quantized
Hamiltonian to change the polarization, and then to use pairing maps
(Sect. 23.8) to "project" back to the Hilbert space for the original
polarization. As explained in Sect. 9.7 of [45], this approach succeeds in the
case that the kinetic energy term is $g(p, p)/(2m)$, where $g$ is the Riemannian
structure on $T^*M$ induced by a Riemannian structure on $TM$. The
quantized kinetic energy operator turns out to be given by the map

$$\psi \mapsto -\frac{\hbar^2}{2m}\left[(\Delta\psi)(x) - \frac{1}{6}R(x)\psi(x)\right], \tag{23.39}$$

where $\Delta$ is the Laplacian for $M$ (taken to be a negative operator) and
where $R(x)$ is the scalar curvature of the Riemannian structure on $TM$.
The calculation in [45] glosses over one technical issue, which is that the
time-evolved polarizations may not be everywhere transverse to the original
polarization. Nevertheless, the calculation provides a reasonable geometric
motivation for the formula (23.39).
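To see the size of the curvature correction, consider a round 2-sphere of radius $r$, an example chosen here purely for illustration. The sketch assumes the operator has the form $-(\hbar^2/2m)(\Delta - R/6)$ and uses two standard facts: the eigenvalues of the Laplacian on the sphere are $-l(l+1)/r^2$, and its scalar curvature is $R = 2/r^2$. The function name is hypothetical.

```python
def kinetic_levels(radius, mass, hbar, l_max):
    """Eigenvalues of -(hbar^2 / 2m) (Laplacian - R/6) on a round 2-sphere.

    Uses Laplacian eigenvalues -l(l+1)/r^2 and scalar curvature R = 2/r^2.
    Returns one value per angular momentum l = 0, ..., l_max.
    """
    scalar_curvature = 2.0 / radius ** 2
    levels = []
    for l in range(l_max + 1):
        laplace_eigenvalue = -l * (l + 1) / radius ** 2
        energy = -(hbar ** 2 / (2.0 * mass)) * (
            laplace_eigenvalue - scalar_curvature / 6.0
        )
        levels.append(energy)
    return levels
```

Under these assumptions every level is shifted upward by the same constant $\hbar^2/(6 m r^2)$, so the energy differences agree with those of $-(\hbar^2/2m)\Delta$ alone.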

It  should  be  emphasized  that,  because  of  the  projections  involved  in 
the  computation  of  the  quantized  kinetic  energy  operator,  it  does  not  sat¬ 
isfy  the  desired  commutation  relations  with  the  quantizations  of  functions 
whose  flow  preserves  the  vertical  polarization.  Nevertheless,  this  approach 
to  quantizing  the  kinetic  energy  may  simply  be  the  best  one  can  do. 


23.7  Quantization  with  Half-Forms:  The 
Complex  Case 

In the case of a purely complex polarization, half-forms are not
"necessary," in that we typically have a nonzero Hilbert space even without
them. Nevertheless, their inclusion gives advantages. In the first place, using
half-forms makes the complex case more parallel to the real case. In the
second place, complex quantization with half-forms simply gives better results
than without half-forms. In the case of the harmonic oscillator, for example,
the inclusion of half-forms allows (Example 23.53) geometric quantization to
reproduce precisely the spectrum $(n + 1/2)\hbar\omega$, $n = 0, 1, 2, \ldots$, that
we found in the traditional treatment. This result should be compared to
Proposition 22.14 without half-forms, where the spectrum is found to be
$n\hbar\omega$.

Throughout this section, we assume that $(N, \omega)$ is a $2n$-dimensional
quantizable symplectic manifold, that $(L, \nabla)$ is a prequantum line bundle
over $N$, and that $P$ is a Kähler polarization on $N$ (Definition 23.19). Since
the definitions in the complex case are very similar to those in the real
case (with a few important differences), we will run through them quickly.
Since $P$ is no longer equal to $\bar{P}$, we need to replace $P$ by $\bar{P}$ in
many of the formulas from Sect. 23.6.

The canonical bundle $\mathcal{K}_P$ of $P$ is the complex line bundle for which
the sections are $n$-forms $\alpha$ satisfying

$$X \lrcorner\, \alpha = 0$$

for each vector field $X$ lying in $\bar{P}$. Sections of $\mathcal{K}_P$ are
precisely the $(n, 0)$-forms on $N$. A section of $\mathcal{K}_P$ is said to be
polarized if

$$X \lrcorner\, (d\alpha) = 0 \tag{23.40}$$

for every vector field lying in $\bar{P}$, or, equivalently, if $d\alpha = 0$.
Polarized sections of $\mathcal{K}_P$ are precisely the holomorphic
$(n, 0)$-forms on $N$. By a square root of $\mathcal{K}_P$ we will mean a complex
line bundle $\delta_P$ over $N$ such that $\delta_P \otimes \delta_P$ is isomorphic
with $\mathcal{K}_P$, together with a particular isomorphism of
$\delta_P \otimes \delta_P$ with $\mathcal{K}_P$. Thus, if $s_1$ and $s_2$ are sections
of $\delta_P$, we think of $s_1 \otimes s_2$ as being a section of $\mathcal{K}_P$. We
assume that such a square root exists and we fix for the remainder of this
section one particular square root $\delta_P$.

If $X$ is a vector field that preserves $P$, in the sense of Definition 23.22,
then $\mathcal{L}_X$ preserves the space of sections of $\mathcal{K}_P$ and also the
space of polarized sections of $\mathcal{K}_P$. The condition (23.40) defining
polarized sections of $\mathcal{K}_P$ can be understood as the vanishing of a
partial connection $\nabla$, defined for vector fields lying in $\bar{P}$, and
given by $\nabla_X \alpha = X \lrcorner\, (d\alpha)$. Both the partial connection
(for vector fields lying in $\bar{P}$) and the Lie derivative (for vector fields
preserving $P$) descend from $\mathcal{K}_P$ to $\delta_P$, as in Proposition 23.41
in the real case. The connection on $L$ and the partial connection on
$\delta_P$ combine to give a partial connection on $L \otimes \delta_P$. A section
$s$ of $L \otimes \delta_P$ is said to be polarized if $\nabla_X s = 0$ for all vector
fields $X$ lying in $\bar{P}$.

Notation 23.49 If $\beta$ is any $2n$-form on $N$, let the expression

$$\frac{\beta}{\lambda}$$

denote the unique function on $N$ such that $\beta = (\beta/\lambda)\lambda$, where
$\lambda$ is the Liouville form in Definition 21.16.

Unlike  the  canonical  bundle  in  the  real  case,  the  canonical  bundle  in  the 
purely  complex  case  carries  a  natural  Hermitian  structure. 

Proposition 23.50 If $\alpha$ is an $(n, 0)$-form on $N$, then at each point the
$2n$-form

$$(-1)^{n(n-1)/2}(-i)^n\, \bar\alpha \wedge \alpha$$

is a non-negative multiple of the Liouville form $\lambda$. There is then a unique
Hermitian structure on $\delta_P$ with the property that for each section $s$ of
$\delta_P$ we have

$$|s|^2 = \left(\frac{(-1)^{n(n-1)/2}(-i)^n\, \overline{(s \otimes s)} \wedge (s \otimes s)}{2^n \lambda}\right)^{1/2}. \tag{23.41}$$

The factor of $2^n$ in the denominator in (23.41) is inserted for convenience,
to make certain formulas come out more nicely.

Proof.  See  Exercise  17.  ■ 

Since, by assumption, there is a Hermitian structure on $L$, the above
Hermitian structure on $\delta_P$ gives rise in a natural way to a Hermitian
structure on $L \otimes \delta_P$.


Definition 23.51 The half-form Hilbert space for a Kähler polarization
$P$ on $N$ is the space of square-integrable polarized sections of
$L \otimes \delta_P$.

In the $\mathbb{C}^n$ case, using the canonical 1-form as our symplectic potential,
elements of the half-form Hilbert space take the form

$$F(z)\, e^{-|\operatorname{Im} z|^2/(2\hbar)} \otimes \sqrt{dz_1 \wedge \cdots \wedge dz_n},$$

where $F$ is a holomorphic function. In this special case, the norm of the
half-form factor $\sqrt{dz_1 \wedge \cdots \wedge dz_n}$ is constant and the half-form
Hilbert space is still identifiable with the space in Conclusion 22.10. In the
case of the unit disk, on the other hand, the presence of half-forms alters
the inner product; see Exercise 16.

We  now  define  quantized  observables  on  the  half- form  Hilbert  space, 
using  the  same  formula  as  in  the  real  case. 

Definition 23.52 If $f$ is a function on $N$ for which $X_f$ preserves $P$, let
$Q(f)$ be the operator on the half-form Hilbert space of $P$ given by

$$Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu + i\hbar\, \mu \otimes \mathcal{L}_{X_f}\nu,$$

where $s$ is decomposed locally as $s = \mu \otimes \nu$, with $\mu$ being a section
of $L$ and $\nu$ a section of $\delta_P$.

These operators satisfy $[Q(f), Q(g)]/(i\hbar) = Q(\{f, g\})$ on the space of
smooth polarized sections of $L \otimes \delta_P$, with the proof of this result
being identical to the proof of Theorem 23.46 in the real case. If $f$ is
real-valued and $X_f$ preserves $P$, then $Q(f)$ will be at least symmetric,
assuming we can find a dense subspace of the half-form Hilbert space
consisting of "nice" functions. (Finding dense subspaces is more difficult in
the holomorphic case than in the real case.) A proof of this claim is sketched
in Exercise 18.

Example 23.53 Consider $\mathbb{R}^2 \cong T^*\mathbb{R}$ with the Kähler polarization
$P$ given by the global complex coordinate $z = x - ip/(m\omega)$, for some
positive number $\omega$. Take $\delta_P$ to be trivial with trivializing section
$\sqrt{dz}$. Consider also the harmonic oscillator Hamiltonian
$H := (p^2 + (m\omega x)^2)/(2m)$. Then $X_H$ preserves $P$ and the operator
$Q(H)$ on the half-form Hilbert space has spectrum consisting of numbers of
the form $(n + 1/2)\hbar\omega$, where $n = 0, 1, 2, \ldots$.

In this example, $\omega$ is the frequency of the oscillator and not the canonical
2-form.

Proof. The calculation is the same as in the proof of Proposition 22.14,
except for the addition of the Lie derivative term. A simple calculation
shows that $\mathcal{L}_{X_H}(dz) = -i\omega\, dz$, from which it follows that
$\mathcal{L}_{X_H}\sqrt{dz} = -(i\omega/2)\sqrt{dz}$, so that the term
$i\hbar\, \mu \otimes \mathcal{L}_{X_H}\nu$ contributes $\hbar\omega/2$. It is then easy
to see that the elements of the form
$e^{-m\omega|\operatorname{Im} z|^2/(2\hbar)} z^n \otimes \sqrt{dz}$ form an orthogonal
basis of eigenvectors for $Q(H)$, with eigenvalues $(n + 1/2)\hbar\omega$. ■
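The half-form contribution $\hbar\omega/2$ can be checked directly. The sketch below is an illustration under stated assumptions: it uses the sign convention $X_f(g) = \partial_x f\,\partial_p g - \partial_p f\,\partial_x g$ (the one for which $X_f = -b\,\partial/\partial x + \cdots$ when $f = a + bp$, as in the proof of Example 23.45), evaluates $X_H(z)$ for $z = x - ip/(m\omega)$, and reads off the factor $\gamma$ with $\mathcal{L}_{X_H}\sqrt{dz} = \gamma\sqrt{dz}$ as half of $X_H(z)/z$. The numerical values of $m$, $\omega$, and $\hbar$ are arbitrary.

```python
M, OMEGA, HBAR = 2.0, 3.0, 0.5  # illustrative values only


def z(x, p):
    """Complex coordinate z = x - i p / (m omega)."""
    return complex(x, -p / (M * OMEGA))


def x_h_of_z(x, p):
    """Apply X_H to z, with X_f(g) = f_x g_p - f_p g_x."""
    dh_dx = M * OMEGA ** 2 * x
    dh_dp = p / M
    dz_dx = 1.0
    dz_dp = -1j / (M * OMEGA)
    return dh_dx * dz_dp - dh_dp * dz_dx


def half_form_shift(x=0.3, p=-1.1):
    """Eigenvalue shift i*hbar*gamma, where L_{X_H} sqrt(dz) = gamma sqrt(dz).

    Since X_H(z) is a constant multiple of z, L_{X_H} dz = d(X_H z) is the
    same multiple of dz, and the square root picks up half of it.
    """
    gamma = 0.5 * x_h_of_z(x, p) / z(x, p)
    return (1j * HBAR * gamma).real
```

With these conventions `x_h_of_z(x, p)` equals $-i\omega z$ at every point, and `half_form_shift()` returns $\hbar\omega/2$, the amount by which the spectrum $n\hbar\omega$ of Proposition 22.14 is raised.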
23.8 Pairing Maps

Pairing maps are designed to allow us to compare the results of quantizing
with respect to two different polarizations. We consider mainly the case
of two "transverse" real polarizations; the case of two complex
polarizations or one real and one complex polarization can be treated with
minor modifications.

Suppose that $P$ and $P'$ are two purely real polarizations and that the
associated leaf spaces $\Xi_1$ and $\Xi_2$ are oriented manifolds. Suppose also
that $P$ and $P'$ are transverse at each point $z \in N$, meaning that
$P_z \cap P'_z = \{0\}$. If $\alpha$ and $\beta$ are polarized sections of
$\mathcal{K}_P$ and $\mathcal{K}_{P'}$, respectively, the transversality assumption is
easily shown to imply that $\alpha \wedge \beta$ is a nowhere-vanishing $2n$-form
on $N$. Thus, for any point $z \in N$, we can define a bilinear "pairing" from
$\delta_{P,z} \times \delta_{P',z} \to \mathbb{R}$ by

$$(\nu_1, \nu_2) = \frac{(\nu_1 \otimes \nu_1) \wedge (\nu_2 \otimes \nu_2)}{\lambda}. \tag{23.42}$$

(Recall Notation 23.49.) We can extend this pairing to a pairing
$\delta_{P,z}^{\mathbb{C}} \times \delta_{P',z}^{\mathbb{C}} \to \mathbb{C}$ that is conjugate
linear in the first factor and linear in the second factor. Finally, we extend
to a pairing of $(L_z \otimes \delta_{P,z}) \times (L_z \otimes \delta_{P',z}) \to \mathbb{C}$
by setting $(\mu_1 \otimes \nu_1, \mu_2 \otimes \nu_2)$ equal to
$(\mu_1, \mu_2)(\nu_1, \nu_2)$, where $(\mu_1, \mu_2)$ is computed with respect to the
Hermitian structure on $L$.

Let $\mathbf{H}_1$ and $\mathbf{H}_2$ denote the half-form Hilbert spaces for $P$ and
$P'$, respectively. Given $s_1 \in \mathbf{H}_1$ and $s_2 \in \mathbf{H}_2$, we define
the pairing of $s_1$ and $s_2$ by

$$\langle s_1, s_2 \rangle_{P,P'} := c \int_N (s_1, s_2)\, \lambda,$$

provided that the integral is absolutely convergent. Here $(s_1, s_2)$ is the
pointwise pairing of $s_1$ and $s_2$ defined in the previous paragraph and $c$ is
a certain "universal" constant, depending only on $\hbar$ and the dimension of
$N$, that can be chosen to make certain examples work out nicely. We now
look for a pairing map $\Lambda_{P,P'} : \mathbf{H}_1 \to \mathbf{H}_2$ with the property
that

$$\langle s_1, s_2 \rangle_{P,P'} = \langle \Lambda_{P,P'} s_1, s_2 \rangle_{\mathbf{H}_2}. \tag{23.43}$$

If the pairing is bounded (i.e., it satisfies
$|\langle s_1, s_2 \rangle_{P,P'}| \le C \|s_1\| \|s_2\|$ for some constant $C$), there
is a unique bounded operator $\Lambda_{P,P'}$ satisfying (23.43). Even if the
pairing is unbounded, we may be able to define $\Lambda_{P,P'}$ as an unbounded
operator.

If we were optimistic, we might hope that the pairing map for any two
transverse polarizations would be unitary, or at least a constant multiple
of a unitary map. If this were the case, it would suggest that quantization
is independent of the choice of polarization, in the sense that there would
be a natural unitary map between the Hilbert spaces for two different
polarizations. As it turns out, however, the typical pairing map is not a
constant multiple of a unitary map. Nevertheless, there are certain special
cases where the pairing map is unitary (up to a constant), including the case
of translation-invariant polarizations on $\mathbb{R}^{2n}$. See also [20] for an
example of a pairing map between a real and a complex polarization that is a
constant multiple of a unitary map.

We  compute  just  one  very  special  case  of  the  pairing  map  between  two 
real  polarizations. 

Example 23.54 Consider $N = \mathbb{R}^2 \cong T^*\mathbb{R}$ and take $L$ to be trivial
with connection 1-form $\theta = p\,dx$. Let $P$ be the vertical polarization,
spanned at each point by $\partial/\partial p$, and let $P'$ be the horizontal
polarization, spanned at each point by $\partial/\partial x$. Then elements $s_1$ of
the half-form space for $P$ have the form

$$s_1(x, p) = \varphi(x) \otimes \sqrt{dx} \tag{23.44}$$

and elements $s_2$ of the half-form space for $P'$ have the form

$$s_2(x, p) = \psi(p)\, e^{ixp/\hbar} \otimes \sqrt{dp}, \tag{23.45}$$

where $\varphi$ and $\psi$ are functions on $\mathbb{R}$. If $c = 1$, the pairing is
computed as

$$\langle s_1, s_2 \rangle_{P,P'} = -\int_{\mathbb{R}^2} \overline{\varphi(x)}\, \psi(p)\, e^{ixp/\hbar}\, dx\, dp. \tag{23.46}$$

If $s_1$ has the form (23.44), then $\Lambda_{P,P'}(s_1)$ has the form (23.45), where

$$\psi(p) = -\int_{\mathbb{R}} \varphi(x)\, e^{-ixp/\hbar}\, dx.$$

Thus, $\Lambda_{P,P'}$ is a scaled version of the Fourier transform and is, in
particular, a constant multiple of a unitary map.

The pairing should be defined initially on some dense subspace of the
Hilbert spaces, such as the subspaces where $\varphi$ and $\psi$ are Schwartz
functions. The pairing map can also be defined initially on the Schwartz
space, recognized as being unitary (up to a constant), and then extended by
continuity to all of $\mathbf{H}_1$. Once the pairing map is extended to
$\mathbf{H}_1$, the pairing itself can be defined for all $s_1 \in \mathbf{H}_1$ and
$s_2 \in \mathbf{H}_2$ by taking (23.43) as the definition of
$\langle s_1, s_2 \rangle_{P,P'}$. Even though it is possible, as just described, to
extend the pairing to all of $\mathbf{H}_1 \times \mathbf{H}_2$, the integral in (23.46)
is not always absolutely convergent.

Proof. The forms (23.44) and (23.45) are obtained by a simple modification
of the argument in the proof of Proposition 22.8. We can compute that the
pointwise pairing of $\sqrt{dx}$ and $\sqrt{dp}$ is $-1$, which gives the indicated
form of the pairing in (23.46). The pairing may be rewritten as

$$\int_{\mathbb{R}} \overline{\left(-\int_{\mathbb{R}} \varphi(x)\, e^{-ixp/\hbar}\, dx\right)}\, \psi(p)\, dp,$$

which gives the indicated form of the pairing map. ■
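The claim that the pairing map is a constant multiple of a unitary map can be illustrated numerically. The sketch below assumes the formula $\psi(p) = -\int \varphi(x) e^{-ixp/\hbar}\,dx$ for the pairing map with $c = 1$, takes the Gaussian $\varphi(x) = e^{-x^2/2}$ as a test function (an arbitrary choice), and compares $\|\psi\|^2$ with $\|\varphi\|^2$ using a trapezoid rule. The expected ratio $2\pi\hbar$ is the standard Plancherel constant for this unnormalized Fourier transform.

```python
import cmath
import math

HBAR = 0.5  # illustrative value


def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for a real- or complex-valued function."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def phi(x):
    return math.exp(-0.5 * x * x)


def pairing_map(phi_fn, p):
    """psi(p) = -integral of phi(x) exp(-i x p / hbar) dx, truncated to [-8, 8]."""
    return -trapezoid(
        lambda x: phi_fn(x) * cmath.exp(-1j * x * p / HBAR), -8.0, 8.0, 400
    )


def norm_ratio():
    """Ratio ||psi||^2 / ||phi||^2 for psi = pairing_map(phi, .)."""
    norm_phi = trapezoid(lambda x: abs(phi(x)) ** 2, -8.0, 8.0, 400)
    norm_psi = trapezoid(
        lambda p: abs(pairing_map(phi, p)) ** 2, -4.0, 4.0, 200
    )
    return norm_psi / norm_phi
```

For the Gaussian the transform is known in closed form, $\psi(p) = -\sqrt{2\pi}\, e^{-p^2/(2\hbar^2)}$, and `norm_ratio()` agrees with $2\pi\hbar$ to within the quadrature error, so $(2\pi\hbar)^{-1/2}\Lambda_{P,P'}$ is norm-preserving on this test function.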
23.9 Exercises

1. Let $L$ be a line bundle with connection $\nabla$ over $N$. Let $s$ be a section
of $L$ and let $X_1$ and $X_2$ be two vector fields on $N$ such that
$X_1(z) = X_2(z)$ for some fixed point $z \in N$. Show that

$$\nabla_{X_1}(s)(z) = \nabla_{X_2}(s)(z).$$

Hint: Use the assumption that $\nabla_{fX} = f\nabla_X$.

2. Let $L$ be a Hermitian line bundle with Hermitian connection $\nabla$ and
let $s_0$ be a locally defined section of $L$ such that $(s_0, s_0) = 1$. Given a
vector field $X$, let $\theta(X)$ be the unique function such that

$$\nabla_X s_0 = -i\theta(X)s_0.$$

Show that $\theta(X)$ is real valued.

Hint: Use the Hermitian property of the connection.

3. Consider the definition of the curvature 2-form $\omega(X, Y)$ in
Definition 23.4.

(a) Show that the expression for $\omega$ is $C^\infty$-linear in each of the
variables $X$, $Y$, and $s$. That is to say, show that for all smooth
functions $f$, we have $\omega(fX, Y)s = f\omega(X, Y)s$, and similarly for
the variables $Y$ and $s$.

(b) Show that the value of $\omega(X, Y)s$ at a point $z$ depends only on
the values of $X$, $Y$, and $s$ at the point $z$.

(c) Show that the value of $\omega(X, Y)$ at a point $z$ does not depend on
the value of $s$ at $z$, provided that $s(z) \neq 0$.

4. Consider the symplectic form $\omega = dp \wedge dx$ on $\mathbb{R}^2$. Define a purely com-
plex polarization on $\mathbb{R}^2$ by taking $P_z$ to be the span of the vector $\partial/\partial z$
in (22.9), for some fixed $\alpha > 0$. Show that $P$ is a Kähler polarization.

5. Let $P$ be the polarization on $\mathbb{R}^2$ in Exercise 4. Show that the function
$\kappa(x, p) := \alpha p^2$ is a Kähler potential for $P$.

6. Suppose that $\omega$ is a $J$-invariant 2-form on a complex manifold $N$. Show
that $\omega$ is a (1,1)-form. (Recall the definitions preceding Lemma 23.34.)

Hint: Write $\omega = \omega_1 + \omega_2$, where $\omega_1$ is a (1,1)-form and $\omega_2$ is a sum of
a (2,0)-form and a (0,2)-form. Show that

\[ \omega_2(JX, JY) = -\omega_2(X, Y) \]

for all tangent vectors $X$ and $Y$.

7. Suppose that $\kappa$ is a smooth, real-valued function on a complex mani-
fold $N$. Show that the 2-form $i\partial\bar{\partial}\kappa$ is a real-valued 2-form.


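8. In Example 23.30, verify that $\theta$ is a symplectic potential for $\omega$, and
compute $\theta(\partial/\partial\bar{z})$, where, with $z = x - iy$, we have $\partial/\partial\bar{z} = (\partial/\partial x -
i\,\partial/\partial y)/2$. Then verify that $s_0(z) := (1 - |z|^2)^{1/\hbar}$ satisfies $\nabla_{\partial/\partial\bar{z}} s_0 = 0$
and thus constitutes a global trivializing holomorphic section.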

9. Consider the situation in Example 23.29. Show that the canonical bun-
dle for $P$ is trivial, with trivializing section $dx$. Let $\delta_P$ be the (non-
trivial) bundle described in the paragraph preceding Definition 23.42.
Since the tensor product of any real line bundle with itself is trivial,
$\delta_P \otimes \delta_P$ is isomorphic to $\mathcal{K}_P$. Let $\sqrt{dx}$ denote a discontinuous section
defined over the set $0 < \phi < 2\pi$ such that $\sqrt{dx} \otimes \sqrt{dx} = dx$. Show that
$\nabla_X(dx) = 0$ and $\nabla_X \sqrt{dx} = 0$ for every vector field $X$ lying in $P$. Now
show that the Bohr-Sommerfeld leaves (in the half-form sense, for this
choice of $\delta_P$) are the sets of the form $\{x\} \times S^1$, where $x/\hbar = n + 1/2$
for some integer $n$.

10. Let $b$ be a smooth, real-valued function on $\mathbb{R}$ and let $c$ be a real
constant. Show that an operator of the form

\[ \psi \mapsto -i\hbar \left( b(x)\psi'(x) + c\, b'(x)\psi(x) \right) \]

is symmetric on $C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$ if and only if $c = 1/2$.


11. Let $P$ be a real polarization and let $f$ be a smooth polarized function
on $N$, that is, one for which derivatives in the direction of $P$ are
zero. Show that $Q(f)$ acts on the half-form Hilbert space simply as
multiplication by $f$. (Compare Proposition 23.25 in the case without
half-forms.)

Hint: Show that $\mathcal{L}_{X_f}\alpha = 0$ whenever $\alpha$ is a polarized section of $\mathcal{K}_P$.


12.  Using  the  identities  C[x,y] 
the  identity  (23.36). 


[Cx,Cy]  and  V/,<d  =  ixf>xg],  verify 


13. Prove that if $P$ is a real polarization on $N$, it is possible to choose a
symplectic potential $\theta$ locally in such a way that $\theta$ is zero on $P$.

Hint :  Use  functions  /x  as  in  the  proof  of  Proposition  23.26. 


14. Suppose that $P$ is a purely real polarization on $N$ and $\theta$ is a local
symplectic potential that vanishes on $P$. Suppose also that $f$ is a real-
valued function for which $X_f$ preserves $P$. Show that the function
$-\theta(X_f) - f$ is constant along the leaves of $P$.

Hint: If $X$ is a vector field lying in $P$, use (21.6) to show that
$X(\theta(X_f)) = d\theta(X, X_f)$.

15. Suppose that $\beta$ is a nowhere vanishing $n$-form on an oriented manifold
$E$, that $X$ is a real vector field on $E$, and that $\phi$ and $\psi$ are smooth,
compactly supported functions on $E$. Verify the following formula for
“integration by parts”:

\[ \int_E (X\phi)\psi\, \beta = -\int_E \phi (X\psi)\, \beta - \int_E \phi\psi\, (\mathrm{div}_\beta X)\, \beta, \]

where $\mathrm{div}_\beta X$ is the function such that $\mathcal{L}_X \beta = (\mathrm{div}_\beta X)\beta$.

Hint: If $\Phi_t$ is the flow generated by $X$, then for all sufficiently small
$t$, $\Phi_t(x)$ is defined for all $x$ in the support of $\phi\psi$ and the integral of
$\Phi_t^*(\phi\psi\beta)$ over $E$ is independent of $t$.


16. Let the notation be as in Exercise 8. Then the canonical bundle for
$P$ is trivial, with trivializing section $dz$. Take $\delta_P$ to be trivial, with
trivializing section $\sqrt{dz}$. Show that every polarized section $s$ of $L \otimes \delta_P$
is of the form

\[ s = F(z) s_0(z) \otimes \sqrt{dz}, \]

where $F$ is holomorphic. Show that the norm of such a section is, up
to a constant, the $L^2$ norm of $F$ with respect to a measure of the form
$(1 - |z|^2)^\nu$, but that the value of $\nu$ is not the same as when half-forms
are not included.


17. Let $P$ be a Kähler polarization on $N$, let $z_1, \ldots, z_n$ be holomorphic
local coordinates on $N$, and let $A$ be the matrix given by

\[ A_{jk} = \omega\left( \frac{\partial}{\partial z_j}, \frac{\partial}{\partial \bar{z}_k} \right). \]

(a) Show that the matrix $iA$ is positive definite.

(b) Show that $\omega = \sum_{j,k} A_{jk}\, dz_j \wedge d\bar{z}_k$.

(c) Show that the quantity $\omega^{\wedge n}/n!$ may be computed as

\[ \det(iA)\,(-1)^{n(n-1)/2}\,(-i)^n\, dz_1 \wedge \cdots \wedge dz_n \wedge d\bar{z}_1 \wedge \cdots \wedge d\bar{z}_n. \]

(d) Verify Proposition 23.50.

18. Let $P$ be a Kähler polarization on $N$, let $\delta_P$ be a fixed square root of
$\mathcal{K}_P$, and let $f$ be a smooth, real-valued function such that $X_f$ preserves
$P$. Throughout this problem, if $s_1$ and $s_2$ are local sections of a line
bundle, with $s_2$ nonvanishing, $s_1/s_2$ will denote the unique function
such that $s_1 = (s_1/s_2) s_2$.

(a) Show that for any continuous compactly supported function $\psi$
on $N$, we have

\[ \int_N X_f(\psi)\, \lambda = 0. \]

Hint: Use Liouville's theorem.

Note: The same result holds if $\psi$ is not compactly supported but
is “sufficiently nice.”

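(b) If $\nu$ is a local nonvanishing section of $\delta_P$, show that

\[ \frac{\mathcal{L}_{X_f} \nu}{\nu} = \frac{1}{2}\, \frac{\mathcal{L}_{X_f}(\nu \otimes \nu)}{\nu \otimes \nu}. \]

(c) If $\alpha$ is any $2n$-form on $N$, show that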


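(d) Suppose $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P$, decomposed
locally as $s_j = \mu_j \otimes \nu_j$, $j = 1, 2$. Show that

\[ X_f (s_1, s_2) = ((\nabla_{X_f} \mu_1) \otimes \nu_1, s_2) + (\mu_1 \otimes (\mathcal{L}_{X_f} \nu_1), s_2) \]
\[ \qquad\qquad + (s_1, (\nabla_{X_f} \mu_2) \otimes \nu_2) + (s_1, \mu_2 \otimes (\mathcal{L}_{X_f} \nu_2)), \]

where $(\cdot, \cdot)$ is computed with respect to the Hermitian structure
on $L \otimes \delta_P$ described in Sect. 23.7.

Hint: Use the identity $\mathcal{L}_{X_f}(\alpha \wedge \beta) = (\mathcal{L}_{X_f} \alpha) \wedge \beta + \alpha \wedge (\mathcal{L}_{X_f} \beta)$.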

(e) Suppose $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P$ belonging to
the domain of $Q(f)$ and such that $(s_1, s_2)$ is “sufficiently nice.”
Show that

\[ \langle s_1, Q(f) s_2 \rangle = \langle Q(f) s_1, s_2 \rangle. \]


Appendix  A 

Review  of  Basic  Material 


A.1 Tensor Products of Vector Spaces

Given two vector spaces $V_1$ and $V_2$ over $\mathbb{C}$, the tensor product is a new vector
space $V_1 \otimes V_2$, together with a bilinear “product” map $\otimes : V_1 \times V_2 \to V_1 \otimes V_2$.
If $V_1$ and $V_2$ are finite dimensional with bases $\{u_j\}$ and $\{v_k\}$, then $V_1 \otimes V_2$
is finite dimensional with $\{u_j \otimes v_k\}$ forming a basis for $V_1 \otimes V_2$. In the finite-
dimensional case, we could simply define the tensor product by this basis
property, but then we would have to worry about whether the construction
is basis independent. Instead, we define $V_1 \otimes V_2$ by a “universal property.”

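Definition A.1 Suppose $V_1$ and $V_2$ are vector spaces over a field $\mathbb{F}$. Then
a tensor product of $V_1$ and $V_2$ is a vector space $W$ over $\mathbb{F}$ together with
a bilinear map $T : V_1 \times V_2 \to W$ having the following “universal property”:
If $U$ is any vector space over $\mathbb{F}$ and $\Phi : V_1 \times V_2 \to U$ is a bilinear map,
then there exists a unique linear map $\tilde{\Phi} : W \to U$ such that the following
diagram commutes:

\[ \begin{array}{ccc} V_1 \times V_2 & \xrightarrow{\;T\;} & W \\ & {\scriptstyle \Phi}\searrow & \downarrow {\scriptstyle \tilde{\Phi}} \\ & & U \end{array} \]

That is, $\Phi = \tilde{\Phi} \circ T$.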

Proposition A.2 For any two vector spaces $V_1$ and $V_2$, a tensor product
of $V_1$ and $V_2$ exists and is unique up to “canonical isomorphism.” That is,
for two tensor products $(W_1, T_1)$ and $(W_2, T_2)$, there is a unique invertible
linear map $\Psi : W_1 \to W_2$ such that $T_2 = \Psi \circ T_1$.

In light of the uniqueness result, we may speak of “the” tensor product of
$V_1$ and $V_2$. We choose any one tensor product and we denote it by $V_1 \otimes V_2$.
We also denote the bilinear map $T : V_1 \times V_2 \to V_1 \otimes V_2$ as $(u, v) \mapsto u \otimes v$. In
this notation, the universal property reads as follows: Given any bilinear
map $\Phi$ of $V_1 \times V_2$ into a vector space $U$, there exists a unique linear map
$\tilde{\Phi} : V_1 \otimes V_2 \to U$ such that

\[ \tilde{\Phi}(u \otimes v) = \Phi(u, v). \]

Proposition A.3 If $V_1$ and $V_2$ are finite-dimensional vector spaces with
bases $\{u_j\}_{j=1}^{n_1}$ and $\{v_k\}_{k=1}^{n_2}$, then $V_1 \otimes V_2$ is finite dimensional and the set
of elements of the form $u_j \otimes v_k$, $1 \le j \le n_1$, $1 \le k \le n_2$, forms a basis for
$V_1 \otimes V_2$. In particular,

\[ \dim(V_1 \otimes V_2) = (\dim V_1)(\dim V_2). \]

It should be emphasized that, in general, not every element of $V_1 \otimes V_2$
is of the form $u \otimes v$ with $u \in V_1$ and $v \in V_2$. All we can say is that each
element of $V_1 \otimes V_2$ can be decomposed as a linear combination of elements
of the form $u \otimes v$. This decomposition, furthermore, is far from canonical;
even in the finite-dimensional case, it depends on a choice of bases for $V_1$
and $V_2$. Nevertheless, the universal property of the tensor product tells us
that we can define linear maps from $V_1 \otimes V_2$ to any vector space $U$, simply
by defining them on elements of the form $u \otimes v$. Provided that $\Phi(u, v)$ is
bilinear in $u$ and $v$, the universal property tells us that there is a unique
linear map $\tilde{\Phi}$ on $V_1 \otimes V_2$ such that on elements of the form $u \otimes v$, $\tilde{\Phi}$ is equal
to $\Phi(u, v)$. A representative application of the universal property is in the
following result.

Proposition A.4 If $A \in \mathrm{End}(V_1)$ and $B \in \mathrm{End}(V_2)$, there exists a unique
linear map $A \otimes B : V_1 \otimes V_2 \to V_1 \otimes V_2$ such that

\[ (A \otimes B)(u \otimes v) = (Au) \otimes (Bv). \]

For $A_1, A_2 \in \mathrm{End}(V_1)$ and $B_1, B_2 \in \mathrm{End}(V_2)$, we have

\[ (A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2). \]

To construct $A \otimes B$, we apply the universal property with $U = V_1 \otimes V_2$
and $\Phi(u, v) = (Au) \otimes (Bv)$. Since $A$ and $B$ are linear and $\otimes$ is bilinear, $\Phi$
is bilinear. The linear map $\tilde{\Phi} : V_1 \otimes V_2 \to V_1 \otimes V_2$ is then the map that we
denote $A \otimes B$.
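
In finite dimensions, Propositions A.3 and A.4 can be checked in coordinates. The following is a minimal sketch, not part of the original text: it assumes the ordered basis $u_j \otimes v_k$ with $j$ as the slow index, in which the matrix of $A \otimes B$ is the Kronecker product. The helper names `kron`, `kron_vec`, `matmul`, and `matvec` and the sample matrices are illustrative choices only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, x):
    """Apply a matrix (list of rows) to a coordinate vector."""
    return [sum(row[i] * x[i] for i in range(len(x))) for row in m]

def kron(a, b):
    """Kronecker product: the matrix of A (x) B in the basis u_j (x) v_k,
    ordered with j as the slow index and k as the fast index."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def kron_vec(u, v):
    """Coordinates of u (x) v in the same ordered basis."""
    return [x * y for x in u for y in v]

# Hypothetical sample data: V1 has dimension 2, V2 has dimension 3.
A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [1, 1]]
B1 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B2 = [[1, 1, 0], [0, 2, 1], [1, 0, 1]]
u, v = [1, -2], [3, 0, 5]

# dim(V1 (x) V2) = (dim V1)(dim V2)
dim = len(kron(A1, B1))
# (A (x) B)(u (x) v) = (Au) (x) (Bv)
lhs_vec = matvec(kron(A1, B1), kron_vec(u, v))
rhs_vec = kron_vec(matvec(A1, u), matvec(B1, v))
# (A1 (x) B1)(A2 (x) B2) = (A1 A2) (x) (B1 B2)
lhs_mat = matmul(kron(A1, B1), kron(A2, B2))
rhs_mat = kron(matmul(A1, A2), matmul(B1, B2))
```

Integer entries are used so that the two sides of each identity can be compared exactly, with no rounding tolerance.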

The tensor product, as we have defined it in this section, applies to
all vector spaces, whether finite dimensional or infinite dimensional. The
construction, however, is purely algebraic; if there is a topology on $V_1$ and
$V_2$, the tensor product takes no account of that topology. In the Hilbert
space setting, then, we will have to refine the notion of the tensor product
so that the tensor product of two Hilbert spaces will again be a Hilbert
space. See Sect. A.4.5.

A.2 Measure Theory

It is assumed that the reader is familiar with the basic notions of measure
theory, including the concepts of $\sigma$-algebras, measures, measurable func-
tions, and the Lebesgue integral. A triple $(X, \Omega, \mu)$, consisting of a set $X$, a
$\sigma$-algebra $\Omega$ of subsets of $X$, and a (non-negative) measure $\mu$ on $\Omega$ is called
a measure space. A measurable function $\psi : X \to \mathbb{C}$ is said to be integrable
if $\int_X |\psi|\, d\mu < \infty$. The $\sigma$-algebra generated by any collection of subsets of a
set $X$ is the smallest $\sigma$-algebra of subsets of $X$ containing that collection.

We assume those parts of measure theory that are entirely standard: the
monotone convergence and dominated convergence theorems, $L^p$ spaces,
and Fubini's theorem. We briefly review a few other topics that might not
be as familiar.

A measure $\mu$ on a measurable space $(X, \Omega)$ is said to be $\sigma$-finite if $X$ can
be written as a countable union of measurable sets of finite measure.


Definition A.5 Suppose $\mu$ and $\nu$ are two $\sigma$-finite measures on a measure
space $(X, \Omega)$. Then we say that $\mu$ is absolutely continuous with respect
to $\nu$ if for all $E \in \Omega$, if $\nu(E) = 0$ then $\mu(E) = 0$. We say that $\mu$ and $\nu$
are equivalent if each measure is absolutely continuous with respect to the
other.

Theorem A.6 (Radon–Nikodym) Suppose $\mu$ and $\nu$ are two $\sigma$-finite
measures on a measure space $(X, \Omega)$ and that $\mu$ is absolutely continuous
with respect to $\nu$. Then there exists a non-negative, measurable function $\rho$
on $X$ such that

\[ \mu(E) = \int_E \rho\, d\nu \]

for all $E \in \Omega$. The function $\rho$ is called the density of $\mu$ with respect to $\nu$.
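
On a finite set with the $\sigma$-algebra of all subsets, the density in the Radon–Nikodym theorem can be written down directly as a ratio of point masses. The sketch below is only an illustration under that assumption; the four-point set and the weights `mu` and `nu` are hypothetical, and the value of the density on $\nu$-null points is an arbitrary choice.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses on X = {a, b, c, d}; nu(c) = 0 and mu(c) = 0.
nu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0), "d": Fraction(1, 4)}
mu = {"a": Fraction(1, 3), "b": Fraction(0), "c": Fraction(0), "d": Fraction(2, 3)}

def measure(weights, subset):
    """Measure of a subset, as the sum of its point masses."""
    return sum((weights[x] for x in subset), Fraction(0))

def is_absolutely_continuous(first, second):
    """Check that second(E) = 0 implies first(E) = 0; on a finite set it
    suffices to look at the points that are null for `second`."""
    return all(first[x] == 0 for x in second if second[x] == 0)

def density(first, second):
    """A density of `first` with respect to `second`; values on the null
    points of `second` are arbitrary and set to 0 here."""
    return {x: (first[x] / second[x] if second[x] != 0 else Fraction(0))
            for x in second}

rho = density(mu, nu)
points = sorted(nu)
subsets = [c for r in range(len(points) + 1) for c in combinations(points, r)]

# mu(E) equals the integral of rho over E with respect to nu, for every E.
identity_holds = all(
    measure(mu, E) == sum((rho[x] * nu[x] for x in E), Fraction(0))
    for E in subsets
)
```

In this example $\mu$ is absolutely continuous with respect to $\nu$ but not conversely, since $\mu(\{b\}) = 0$ while $\nu(\{b\}) \neq 0$; the two measures are therefore not equivalent in the sense of Definition A.5.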

Definition A.7 A collection $\mathcal{M}$ of subsets of a set $X$ is called a mono-
tone class if $\mathcal{M}$ is closed under countable increasing unions and countable
decreasing intersections.

A countable increasing union means the union of a sequence $E_j$ of sets
where $E_j$ is contained in $E_{j+1}$ for each $j$, with a similar definition for
countable decreasing intersections.

Theorem A.8 (Monotone Class Lemma) Suppose $\mathcal{M}$ is a monotone
class of subsets of a set $X$ and suppose $\mathcal{M}$ contains an algebra $\mathcal{A}$ of subsets
of $X$. Then $\mathcal{M}$ contains the $\sigma$-algebra generated by $\mathcal{A}$.

Corollary A.9 Suppose $\mu$ and $\nu$ are two finite measures on a measure
space $(X, \Omega)$. Suppose $\mu$ and $\nu$ agree on an algebra $\mathcal{A} \subset \Omega$. Then $\mu$ and $\nu$
agree on the $\sigma$-algebra generated by $\mathcal{A}$.

Note that in general, the collection of sets on which two measures agree
is not a $\sigma$-algebra, nor even an algebra.

Theorem A.10 Suppose $\mu$ is a measure on the Borel $\sigma$-algebra in a locally
compact, separable metric space $X$. Suppose also that $\mu(K) < \infty$ for each
compact subset $K$ of $X$. Then the space of continuous functions of compact
support on $X$ is dense in $L^p(X, \mu)$, for all $p$ with $1 \le p < \infty$.


A word of clarification is in order here. If $\psi$ is a continuous function on
$X$ with compact support, then $\int_X |\psi|^p\, d\mu$ is finite, since $\psi$ is bounded and
$\mu$ is finite on compact sets. Thus, we can define a map from $C_c(X)$ into
$L^p(X, \mu)$ by mapping a continuous function $\psi$ of compact support to the
equivalence class $[\psi]$. The theorem is asserting, more precisely, that the
image of $C_c(X)$ under this map is dense in $L^p(X, \mu)$. It should be noted,
however, that the map $\psi \mapsto [\psi]$ need not be injective. After all, if there
is a nonempty open set $U$ inside $X$ with $\mu(U) = 0$, then for any $\psi$ with
support contained in $U$, the equivalence class $[\psi]$ will be the zero element of
$L^p(X, \mu)$. Nevertheless, we will allow ourselves a small abuse of terminology
and say that $C_c(X)$ is dense in $L^p(X, \mu)$.


A.3 Elementary Functional Analysis

In this section, we briefly review some of the results from elementary func-
tional analysis that we make use of in the text. Most of these results can be
found in the book of Rudin [32].

A.3.1 The Stone-Weierstrass Theorem

The Weierstrass theorem states that every continuous, real-valued function
on a closed, bounded interval can be uniformly approximated by polynomials.
A substantial generalization of this was obtained by Stone. If $X$ is a compact
metric space, let $C(X; \mathbb{R})$ and $C(X; \mathbb{C})$ denote the spaces of continuous real-
and complex-valued functions on $X$, respectively. A subset $\mathcal{A}$ of $C(X; \mathbb{F})$
is called an algebra if it is closed under pointwise addition, pointwise mul-
tiplication, and multiplication by elements of $\mathbb{F}$, where $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. An
algebra $\mathcal{A}$ is said to separate points if for any two distinct points $x$ and $y$
in $X$, there exists $f \in \mathcal{A}$ such that $f(x) \neq f(y)$. We use on $C(X; \mathbb{F})$ the
supremum norm, given by

\[ \|f\|_{\sup} := \sup_{x \in X} |f(x)|, \]

and $C(X; \mathbb{F})$ is complete with respect to the associated distance function,

\[ d(f, g) = \|f - g\|_{\sup}. \]

Theorem A.11 (Stone–Weierstrass, Real Version) Let $X$ be a com-
pact metric space and let $\mathcal{A}$ be an algebra in $C(X; \mathbb{R})$. If $\mathcal{A}$ contains the
constant functions and separates points, then $\mathcal{A}$ is dense in $C(X; \mathbb{R})$ with
respect to the supremum norm.

Theorem A.12 (Stone–Weierstrass, Complex Version) Let $X$ be a
compact metric space and let $\mathcal{A}$ be an algebra in $C(X; \mathbb{C})$. If $\mathcal{A}$ contains the
constant functions, separates points, and is closed under complex conjuga-
tion, then $\mathcal{A}$ is dense in $C(X; \mathbb{C})$ with respect to the supremum norm.

A consequence of the complex version of the Stone-Weierstrass theorem
is the following: If $K$ is a compact subset of $\mathbb{C}$, then every continuous,
complex-valued function on $K$ can be uniformly approximated by polyno-
mials in $z$ and $\bar{z}$.
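
The classical Weierstrass theorem on $[0, 1]$ can be made concrete with Bernstein polynomials, $B_n f(x) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1-x)^{n-k}$, which converge uniformly to any continuous $f$. This is a standard constructive route that the text does not use; the sketch below is an illustration only, and the supremum norm is merely estimated on a finite grid.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f on [0, 1] as a callable."""
    coeffs = [f(k / n) for k in range(n + 1)]

    def p(x):
        return sum(c * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, c in enumerate(coeffs))
    return p

def sup_error(f, p, samples=1001):
    """Estimate the supremum norm of f - p on a uniform grid in [0, 1]."""
    return max(abs(f(i / (samples - 1)) - p(i / (samples - 1)))
               for i in range(samples))

def f(x):
    """A continuous function that is not differentiable at x = 1/2."""
    return abs(x - 0.5)

# The estimated sup-norm error shrinks as the degree grows.
errors = [sup_error(f, bernstein(f, n)) for n in (4, 16, 64, 256)]
```

For this particular $f$ the error decreases only slowly with the degree, which reflects the kink at $x = 1/2$; the theorem guarantees convergence but says nothing about its rate.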

A.3.2 The Fourier Transform

We now describe the Fourier transform on $\mathbb{R}^n$, in various forms.

Definition A.13 For any $\psi \in L^1(\mathbb{R}^n)$, define the Fourier transform of
$\psi$ to be the function $\hat{\psi}$ on $\mathbb{R}^n$ given by

\[ \hat{\psi}(\mathbf{k}) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-i\mathbf{k}\cdot\mathbf{x}} \psi(\mathbf{x})\, d\mathbf{x}. \]

Proposition A.14 For any $\psi \in L^1(\mathbb{R}^n)$, the Fourier transform $\hat{\psi}$ of $\psi$ has
the following properties: (1) $|\hat{\psi}(\mathbf{k})| \le (2\pi)^{-n/2} \|\psi\|_{L^1}$, (2) $\hat{\psi}$ is continuous,
and (3) $\hat{\psi}(\mathbf{k})$ tends to zero as $|\mathbf{k}|$ tends to $\infty$.

The bound on $\hat{\psi}$ is obvious and the continuity of $\hat{\psi}$ follows from dom-
inated convergence. To show that $\hat{\psi}$ tends to zero at infinity, we first es-
tablish this on a dense subspace of $L^1(\mathbb{R}^n)$ (e.g., the Schwartz space; see
below) and then take uniform limits.


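Definition A.15 The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is the space of all $C^\infty$ func-
tions $\psi$ on $\mathbb{R}^n$ such that

\[ \lim_{|\mathbf{x}| \to \infty} \left| \mathbf{x}^{\mathbf{j}} \partial^{\mathbf{k}} \psi(\mathbf{x}) \right| = 0 \]

for all $n$-tuples of non-negative integers $\mathbf{j}$ and $\mathbf{k}$. Here if $\mathbf{j} = (j_1, \ldots, j_n)$
then $\mathbf{x}^{\mathbf{j}} = x_1^{j_1} \cdots x_n^{j_n}$ and

\[ \partial^{\mathbf{k}} = \left( \frac{\partial}{\partial x_1} \right)^{k_1} \cdots \left( \frac{\partial}{\partial x_n} \right)^{k_n}. \]

An element of the Schwartz space is called a Schwartz function.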

Proposition A.16 If $\psi$ belongs to $\mathcal{S}(\mathbb{R}^n)$, then $\hat{\psi}$ also belongs to $\mathcal{S}(\mathbb{R}^n)$.

The proof of this result hinges on the behavior of the Fourier transform
under differentiation and under multiplication by $x_j$, results which are of
interest in their own right.

Proposition A.17 If $\psi$ is a Schwartz function, the following properties
hold.

1. We have

\[ \widehat{\frac{\partial \psi}{\partial x_j}}(\mathbf{k}) = i k_j \hat{\psi}(\mathbf{k}). \tag{A.1} \]

2. The function $\hat{\psi}$ is differentiable at every point and the Fourier trans-
form of the function $x_j \psi(\mathbf{x})$ is given by

\[ \widehat{x_j \psi}(\mathbf{k}) = i \frac{\partial \hat{\psi}}{\partial k_j}(\mathbf{k}). \tag{A.2} \]

The first point is proved by integration by parts and the second by dif-
ferentiation under the integral in the definition of $\hat{\psi}$.

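Theorem A.18 (Fourier Inversion and Plancherel Formula, I) The
Fourier transform on $\mathcal{S}(\mathbb{R}^n)$ has the following properties.

1. The Fourier transform maps the Schwartz space onto the Schwartz
space.

2. For all $\psi \in \mathcal{S}(\mathbb{R}^n)$, the function $\psi$ can be recovered from its Fourier
transform by the Fourier inversion formula:

\[ \psi(\mathbf{x}) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{i\mathbf{k}\cdot\mathbf{x}} \hat{\psi}(\mathbf{k})\, d\mathbf{k}. \]

3. For all $\psi \in \mathcal{S}(\mathbb{R}^n)$, we have the Plancherel theorem:

\[ \int_{\mathbb{R}^n} |\psi(\mathbf{x})|^2\, d\mathbf{x} = \int_{\mathbb{R}^n} |\hat{\psi}(\mathbf{k})|^2\, d\mathbf{k}. \]

Since the Schwartz space is dense in $L^2(\mathbb{R}^n)$, the BLT theorem (Theorem A.36)
and Theorem A.18 imply that the Fourier transform extends uniquely to an
isometric map of $L^2(\mathbb{R}^n)$ onto $L^2(\mathbb{R}^n)$.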

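Theorem A.19 (Fourier Inversion and Plancherel Theorem, II)
The Fourier transform extends to an isometric map $\mathcal{F}$ of $L^2(\mathbb{R}^n)$ onto
$L^2(\mathbb{R}^n)$. This map may be computed as

\[ \mathcal{F}(\psi)(\mathbf{k}) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|\mathbf{x}| \le A} e^{-i\mathbf{k}\cdot\mathbf{x}} \psi(\mathbf{x})\, d\mathbf{x}, \tag{A.3} \]

where the limit is in the norm topology of $L^2(\mathbb{R}^n)$. The inverse map $\mathcal{F}^{-1}$
may be computed as

\[ \mathcal{F}^{-1}(\psi)(\mathbf{x}) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|\mathbf{k}| \le A} e^{i\mathbf{k}\cdot\mathbf{x}} \psi(\mathbf{k})\, d\mathbf{k}. \]

If $\psi$ belongs to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, then by dominated convergence, the
limit in (A.3) coincides with the $L^1$ Fourier transform in Definition A.13.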

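Definition A.20 For two measurable functions $\phi$ and $\psi$, define the con-
volution $\phi * \psi$ of $\phi$ and $\psi$ by the formula

\[ (\phi * \psi)(\mathbf{x}) = \int_{\mathbb{R}^n} \phi(\mathbf{x} - \mathbf{y}) \psi(\mathbf{y})\, d\mathbf{y}, \]

provided that the integral is absolutely convergent for all $\mathbf{x}$.

Proposition A.21 Suppose that $\phi$ and $\psi$ belong to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$. Then
$\phi * \psi$ is defined and belongs to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ and we have

\[ (2\pi)^{-n/2} \mathcal{F}(\phi * \psi) = \mathcal{F}(\phi) \mathcal{F}(\psi). \]

This result is proved by plugging $\phi * \psi$ into the definition of the Fourier
transform, writing $e^{-i\mathbf{k}\cdot\mathbf{x}}$ as $e^{-i\mathbf{k}\cdot\mathbf{y}} e^{-i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}$, and using Fubini's theorem.
We will have occasion to use the following Gaussian integral.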

Proposition A.22 For all $a > 0$ and $b \in \mathbb{C}$, we have

\[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/(2a)} e^{bx}\, dx = \sqrt{a}\, e^{ab^2/2}. \]

Taking $b = ik$ in the last part of the proposition gives us the Fourier
transform of the Gaussian function $e^{-x^2/(2a)}$. Taking $b = 0$ allows us to
determine the proper normalization of the Gaussian probability density.
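
The Gaussian integral can be spot-checked numerically for real, imaginary, and general complex $b$. The sketch below is a sanity check rather than a proof: it uses a trapezoid rule on a truncated interval, and the function names, the truncation `half_width`, the step count, and the sample values of $(a, b)$ are all illustrative assumptions.

```python
import cmath
import math

def gaussian_integral(a, b, half_width=20.0, steps=20000):
    """Trapezoid-rule approximation of
    (2*pi)^(-1/2) * integral of exp(-x^2/(2a)) * exp(b*x) dx
    over [-half_width, half_width]; the neglected tails are tiny
    for the parameter values used below."""
    h = 2 * half_width / steps
    total = 0j
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp(-x * x / (2 * a) + b * x)
    return total * h / math.sqrt(2 * math.pi)

def closed_form(a, b):
    """Right-hand side of the proposition: sqrt(a) * exp(a * b^2 / 2)."""
    return math.sqrt(a) * cmath.exp(a * b * b / 2)

# b = 0 (normalization), real b, purely imaginary b = ik, and general b.
cases = [(1.0, 0.0), (2.0, 0.5), (0.5, 3.0j), (1.5, 1.0 + 2.0j)]
errors = [abs(gaussian_integral(a, b) - closed_form(a, b)) for a, b in cases]
```

The case $b = ik$ corresponds to the Fourier transform of the Gaussian mentioned above, where the closed form reduces to $\sqrt{a}\, e^{-ak^2/2}$.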


A.3.3 Distributions

In this section we give a brief account of the theory of distributions (what
physicists call “generalized functions”), including the notion of “derivative
in the distribution sense.”

The idea is that we study functions by studying their integrals against
some class of very nice “test functions.” Consider, for example, a locally
integrable function $f$ and consider integrals of the form

\[ \int_{\mathbb{R}^n} \chi(\mathbf{x}) f(\mathbf{x})\, d\mathbf{x}, \tag{A.4} \]

where $\chi$ belongs to $C_c^\infty(\mathbb{R}^n)$, the space of smooth, compactly supported
functions. We might think, for example, that $\chi$ is positive, has integral
equal to 1, and is supported near some point $\mathbf{a} \in \mathbb{R}^n$. In that case, the
integral (A.4) is an approximation to the value of $f$ at $\mathbf{a}$, what physicists
describe as a “smeared out” version of $f(\mathbf{a})$.

Proposition A.23 Suppose $f_1$ and $f_2$ are locally integrable functions on
$\mathbb{R}^n$. If

\[ \int_{\mathbb{R}^n} \chi(\mathbf{x}) f_1(\mathbf{x})\, d\mathbf{x} = \int_{\mathbb{R}^n} \chi(\mathbf{x}) f_2(\mathbf{x})\, d\mathbf{x} \]

for all $\chi \in C_c^\infty(\mathbb{R}^n)$, then $f_1(\mathbf{x}) = f_2(\mathbf{x})$ for almost every $\mathbf{x}$.

The idea now is that we allow objects that do not have values at points,
but for which something like (A.4) makes sense. Mathematically, we think
of (A.4) as a linear functional on $C_c^\infty(\mathbb{R}^n)$.

Definition A.24 A sequence $\chi_m \in C_c^\infty(\mathbb{R}^n)$ is said to converge to $\chi \in
C_c^\infty(\mathbb{R}^n)$ if (1) there exists a single compact set $K$ containing the support
of all the $\chi_m$'s, (2) $\chi_m$ converges uniformly to $\chi$, and (3) each derivative
of $\chi_m$ converges uniformly to the corresponding derivative of $\chi$.

Definition A.25 A distribution on $\mathbb{R}^n$ is a linear map $T : C_c^\infty(\mathbb{R}^n) \to \mathbb{C}$
having the following continuity property: If $\chi_m$ converges to $\chi$ in the sense
of Definition A.24, then $T(\chi_m)$ converges to $T(\chi)$.

The continuity condition on $T$ should be regarded as a technicality, in
that any functional that is well defined and linear on all of $C_c^\infty(\mathbb{R}^n)$ and is
obtained in a reasonably constructive fashion will satisfy this property.

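Example A.26 The Dirac $\delta$-“function” is the distribution $\delta$ defined by

\[ \delta(\chi) = \chi(0). \]

Definition A.27 If $T$ is a distribution and $f$ is a locally integrable func-
tion, the expression “$T$ is equal to $f$” or “$T$ is given by $f$” means that

\[ T(\chi) = \int_{\mathbb{R}^n} \chi(\mathbf{x}) f(\mathbf{x})\, d\mathbf{x} \]

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

Definition A.28 If $T$ is a distribution, define the distribution $\partial T/\partial x_j$ by
the formula

\[ \frac{\partial T}{\partial x_j}(\chi) = -T\left( \frac{\partial \chi}{\partial x_j} \right). \]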

It is easy to verify that if $T$ has the continuity property in Definition
A.25, then so does $\partial T/\partial x_j$. Furthermore, if $T$ is given by a continuously
differentiable function, then the derivative of $T$ in the distribution sense
coincides with the derivative of $T$ in the classical sense, as can easily be
shown using integration by parts. If $T$ is a distribution, we may define $\Delta T$
by repeated applications of Definition A.28, with the result that

\[ (\Delta T)(\chi) = T(\Delta \chi). \]

Proposition A.29 If $\phi$ and $\psi$ are $L^2$ functions, the equation $\partial\psi/\partial x_j = \phi$
holds in the distribution sense if and only if

\[ -\left\langle \frac{\partial \chi}{\partial x_j}, \psi \right\rangle = \langle \chi, \phi \rangle \]

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Similarly, the equation $\Delta\psi = \phi$ holds in the distribu-
tion sense if and only if

\[ \langle \Delta\chi, \psi \rangle = \langle \chi, \phi \rangle \]

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

Proposition A.30 If $T$ is a distribution on $\mathbb{R}$ and $dT/dx$ is the zero dis-
tribution, then $T$ is a constant, meaning that there is some constant $c$ such
that

\[ T(\chi) = \int_{-\infty}^{\infty} \chi(x)\, c\, dx. \tag{A.5} \]

Suppose, in particular, that $T$ is given by a locally integrable function $f$,
and the derivative of $T$ is zero. Then Proposition A.30 tells us that for some
constant $c$, we have $\int \chi(x)(f(x) - c)\, dx = 0$ for all $\chi \in C_c^\infty(\mathbb{R})$. Then
Proposition A.23 tells us that $f(x) = c$ almost everywhere. This means that
if the derivative of $f$ is zero, even in the weak (or distributional) sense, then
$f$ must be constant.
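
Definition A.28 can be illustrated on a distribution that is not given by a differentiable function. A standard example, which is not taken from the text, is the distribution $T$ on $\mathbb{R}$ given by the Heaviside step function: for any test function $\chi$, $(dT/dx)(\chi) = -\int_0^\infty \chi'(x)\, dx = \chi(0)$, so $dT/dx$ is the Dirac $\delta$ of Example A.26. The sketch below checks this numerically for one hypothetical smooth bump function; the names `bump`, `bump_prime`, `integrate`, and `heaviside_T` are illustrative.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported on [center - 1, center + 1]."""
    t = x - center
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def bump_prime(x, center=0.3):
    """Classical derivative of bump, from the closed-form expression."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return bump(x, center) * (-2.0 * t) / (1.0 - t * t) ** 2

def integrate(g, lo, hi, steps=200000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def heaviside_T(chi):
    """Distribution given by the Heaviside function: the integral of chi
    over [0, infinity). The cutoff 2.0 is enough here because the test
    functions used below vanish for x >= 1.3."""
    return integrate(chi, 0.0, 2.0)

# Definition A.28: (dT/dx)(chi) = -T(d chi / dx).
derivative_of_T_at_bump = -heaviside_T(bump_prime)
# Example A.26: delta(chi) = chi(0).
delta_at_bump = bump(0.0)
```

The two numbers agree up to quadrature error, even though the Heaviside function has no classical derivative at the origin.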


A.3.4 Banach Spaces

In this section, we define Banach spaces and describe some of their elemen-
tary properties.

Definition A.31 A norm on a vector space $V$ over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) is a
map from $V$ into $\mathbb{R}$, denoted $\psi \mapsto \|\psi\|$, with the following properties.

1. For all $\psi \in V$, $\|\psi\| \ge 0$, with equality if and only if $\psi = 0$.

2. For all $\psi \in V$ and $c \in \mathbb{F}$, we have $\|c\psi\| = |c|\, \|\psi\|$.

3. For all $\phi, \psi \in V$, we have $\|\phi + \psi\| \le \|\phi\| + \|\psi\|$.

If $\|\cdot\|$ is a norm on $V$, then we can define a distance function $d$ on $V$ by
setting $d(\phi, \psi) = \|\psi - \phi\|$.

Definition A.32 A normed vector space is said to be a Banach space
if it is complete with respect to the associated distance function. A Banach
space is said to be separable if it contains a countable dense subset.

One important class of examples of Banach spaces is the $L^p$ spaces.




Definition A.33 An infinite series $\sum_{n=1}^{\infty} \psi_n$, with values in a normed space
$V$, is said to converge if there exists some $L \in V$ such that

\[ \lim_{N \to \infty} \|S_N - L\| = 0, \]

where $S_N = \sum_{n=1}^{N} \psi_n$.

Proposition A.34 If $V$ is a Banach space, then absolute convergence im-
plies convergence in $V$. That is, if

\[ \sum_{n=1}^{\infty} \|\psi_n\| < \infty, \]

then $\sum_{n=1}^{\infty} \psi_n$ converges in $V$.


Definition A.35 If $V_1$ and $V_2$ are normed spaces, a linear map $T : V_1 \to
V_2$ is bounded if

\[ \sup_{\psi \in V_1 \setminus \{0\}} \frac{\|T\psi\|}{\|\psi\|} < \infty. \tag{A.6} \]

If $T$ is bounded, then the supremum in (A.6) is called the operator norm
of $T$, denoted $\|T\|$.

Theorem A.36 (Bounded Linear Transformation Theorem) Let $V_1$
be a normed space and $V_2$ a Banach space. Suppose $W$ is a dense subspace
of $V_1$ and $T : W \to V_2$ is a bounded linear map. Then there exists a unique
bounded linear map $\tilde{T} : V_1 \to V_2$ such that $\tilde{T}|_W = T$. Furthermore, the
norm of $\tilde{T}$ equals the norm of $T$.


Definition A.37 If $V$ is a normed space over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$), then a
bounded linear functional on $V$ is a bounded linear map of $V$ into $\mathbb{F}$,
where on $\mathbb{F}$ we use the norm given by the absolute value. The collection of
all bounded linear functionals, with the norm given by (A.6), is called the
dual space to $V$, denoted $V^*$.

Theorem A.38 If $V$ is a normed vector space, then the following results
hold.

1. The dual space $V^*$ is a Banach space.

2. For all $\psi \in V$, there exists a nonzero $\xi \in V^*$ such that

\[ |\xi(\psi)| = \|\xi\|\, \|\psi\|. \]

In particular, if $\xi(\psi) = 0$ for all $\xi \in V^*$, then $\psi = 0$.


Theorem A.39 (Closed Graph Theorem) Suppose that $V_1$ is a Banach
space and $V_2$ a normed vector space. For any linear map $T : V_1 \to V_2$, let
$\mathrm{Graph}(T)$ denote the set of pairs $(\psi, T\psi)$ in $V_1 \times V_2$ such that $\psi \in V_1$. If
the graph of $T$ is a closed subset of $V_1 \times V_2$, then $T$ is bounded.

Here is a simple example of how the closed graph theorem can be applied.
Suppose $V_1$ and $V_2$ are Banach spaces and $T : V_1 \to V_2$ is a linear map that
is one-to-one, onto, and bounded. Then the inverse map $T^{-1} : V_2 \to V_1$ is
automatically bounded. To verify this, we first check that if $T$ is bounded,
then the graph of $T$ is closed (easy). Then we observe that the graph of
$T^{-1}$ is also closed, since it is obtained from the graph of $T$ by the map
$(\phi, \psi) \mapsto (\psi, \phi)$. Thus, the theorem tells us that $T^{-1}$ is bounded.

Theorem A.40 (Principle of Uniform Boundedness) Suppose $\{T_\alpha\}$
is any family of bounded linear maps from a Banach space $V_1$ to a normed
space $V_2$. Suppose that for each $\psi \in V_1$, there is a constant $C_\psi$ such that
$\|T_\alpha \psi\| \le C_\psi$ for all $\alpha$. Then there exists a constant $C$ such that $\|T_\alpha\| \le C$
for all $\alpha$.

That is, in contrapositive form, if the family $\{T_\alpha\}$ is unbounded, $\{T_\alpha \psi\}$
must be unbounded for some $\psi \in V_1$.

Corollary A.41 Suppose $V$ is a Banach space and $E$ is a nonempty subset
of $V$. Suppose that for all $\xi \in V^*$ there exists a constant $C_\xi$ such that
$|\xi(\psi)| \le C_\xi$ for all $\psi \in E$. Then $E$ is a bounded set.

The corollary is obtained by identifying each $\psi \in V$ with the linear map
$e_\psi : V^* \to \mathbb{C}$ given by evaluation on $\psi$; that is, $e_\psi(\xi) = \xi(\psi)$. Note that by
Point 2 of Theorem A.38, the norm of $e_\psi$ as an element of $V^{**}$ is equal to
the norm of $\psi$ as an element of $V$.


A.4 Hilbert Spaces and Operators on Them

A.4.1 Inner Product Spaces and Hilbert Spaces

We now introduce a generalization to arbitrary vector spaces over $\mathbb{R}$ or $\mathbb{C}$
of the usual inner product (or dot product) on $\mathbb{R}^n$.

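Definition A.42 An inner product on a vector space $V$ over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or
$\mathbb{C}$) is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ with the following properties.

1. For all $\phi, \psi \in V$, we have $\langle \psi, \phi \rangle = \overline{\langle \phi, \psi \rangle}$.

2. For all $\phi \in V$, $\langle \phi, \phi \rangle$ is real and non-negative, and $\langle \phi, \phi \rangle = 0$ only if
$\phi = 0$.

3. For all $\phi, \psi \in V$ and $c \in \mathbb{F}$, we have $\langle c\phi, \psi \rangle = \bar{c} \langle \phi, \psi \rangle$ and $\langle \phi, c\psi \rangle =
c \langle \phi, \psi \rangle$.

4. For all $\phi, \psi, \chi \in V$, we have $\langle \phi + \psi, \chi \rangle = \langle \phi, \chi \rangle + \langle \psi, \chi \rangle$ and
$\langle \phi, \psi + \chi \rangle = \langle \phi, \psi \rangle + \langle \phi, \chi \rangle$.

Note that we are following the physics convention of taking the complex
conjugate in Point 3 of the definition on the first factor in the inner product.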


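Proposition A.43 If $V$ is an inner product space, then for all $\phi, \psi \in V$,
we have the Cauchy-Schwarz inequality:

\[ |\langle \phi, \psi \rangle|^2 \leq \langle \phi, \phi \rangle \langle \psi, \psi \rangle. \]

Furthermore, if $\|\cdot\| : V \to \mathbb{R}$ is defined by

\[ \|\psi\| = \sqrt{\langle \psi, \psi \rangle}, \tag{A.7} \]

then $\|\cdot\|$ is a norm on $V$.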
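As a concrete check of Definition A.42 and Proposition A.43, the following minimal sketch uses the standard inner product on $\mathbb{C}^3$, conjugate-linear in the first factor as in the physics convention. The helper names `inner` and `norm` and the sample vectors are illustrative assumptions, not part of the text.

```python
import math

def inner(phi, psi):
    """Standard inner product on C^n, conjugate-linear in the FIRST argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(psi):
    """Norm induced by the inner product, as in (A.7)."""
    return math.sqrt(inner(psi, psi).real)

# Arbitrary sample data (hypothetical values chosen for illustration).
phi = [1 + 2j, -1j, 0.5 + 0j]
psi = [2 - 1j, 3 + 0j, 1j]
c = 0.5 - 1.5j

# Point 1: conjugate symmetry.
assert abs(inner(psi, phi) - inner(phi, psi).conjugate()) < 1e-12
# Point 3: linear in the second factor, conjugate-linear in the first.
assert abs(inner(phi, [c * x for x in psi]) - c * inner(phi, psi)) < 1e-12
assert abs(inner([c * x for x in phi], psi) - c.conjugate() * inner(phi, psi)) < 1e-12
# Cauchy-Schwarz inequality and the triangle inequality for the induced norm.
assert abs(inner(phi, psi)) ** 2 <= inner(phi, phi).real * inner(psi, psi).real
assert norm([a + b for a, b in zip(phi, psi)]) <= norm(phi) + norm(psi)
```

A finite-dimensional check of this kind illustrates the statements but does not, of course, replace the proof.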


Definition A.44 A Hilbert space is a vector space $\mathbf{H}$ over $\mathbb{R}$ or $\mathbb{C}$,
equipped with an inner product $\langle \cdot, \cdot \rangle$, such that $\mathbf{H}$ is complete in the norm
given by (A.7).

That is to say, a Hilbert space is a Banach space in which the norm
comes from an inner product. In Appendix A.4 only, we allow $\mathbf{H}$ to denote
an arbitrary Hilbert space over $\mathbb{R}$ or $\mathbb{C}$. (In the main body of the text, $\mathbf{H}$
denotes a separable complex Hilbert space.)

Definition A.45 Suppose $\mathbf{H}_j$ is a sequence of separable Hilbert spaces.
Then the Hilbert space direct sum, denoted

\[ \mathbf{H} = \bigoplus_{j=1}^{\infty} \mathbf{H}_j, \]

is the space of sequences $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such that $\psi_n \in \mathbf{H}_n$ and such
that

\[ \sum_{j=1}^{\infty} \|\psi_j\|^2 < \infty. \tag{A.8} \]

The finite direct sum of the $\mathbf{H}_j$'s is the set of $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such
that $\psi_j = 0$ for all but finitely many values of $j$.

We define an inner product on the direct sum by setting

\[ \langle \phi, \psi \rangle = \sum_{j=1}^{\infty} \langle \phi_j, \psi_j \rangle \tag{A.9} \]

for all $\phi, \psi \in \mathbf{H}$. This inner product is well defined and $\mathbf{H}$ is complete with
respect to this inner product, and hence a Hilbert space.

One important example of a Hilbert space is $L^2(X, \mu)$, where $(X, \mu)$ is a
measure space.

Definition A.46 If $(X, \mu)$ is a measure space, define an inner product on
$L^2(X, \mu)$ by the formula

\[ \langle \phi, \psi \rangle = \int_X \overline{\phi(x)} \psi(x) \, d\mu(x). \tag{A.10} \]

A standard result in measure theory states that the integral on the right-hand
side of (A.10) is absolutely convergent for all $\phi$ and $\psi$ in $L^2(X, \mu)$.
It is then easy to verify that $\langle \cdot, \cdot \rangle$ is indeed an inner product on $L^2(X, \mu)$.
Another standard result states that $L^2(X, \mu)$ is complete with respect to
the norm associated with the inner product in (A.10); thus, $L^2(X, \mu)$ is a
Hilbert space.


A.4.2 Orthogonality

One  reason  that  Hilbert  spaces  are  nicer  to  work  with  than  general  Banach 
spaces  is  that  we  have  the  concept  of  orthogonality. 

Definition A.47 Two elements $\phi$ and $\psi$ of an inner product space are
orthogonal if $\langle \phi, \psi \rangle = 0$.

Definition A.48 If $V$ is any subspace of $\mathbf{H}$, define a subspace $V^\perp$ of $\mathbf{H}$
by

\[ V^\perp = \{ \phi \in \mathbf{H} \mid \langle \phi, \psi \rangle = 0 \text{ for all } \psi \in V \}. \]

Then $V^\perp$ is called the orthogonal space of $V$.


Proposition A.49

1. If $V$ is a closed subspace of $\mathbf{H}$, every $\psi \in \mathbf{H}$ can be decomposed
uniquely as $\psi = \psi_1 + \psi_2$, with $\psi_1 \in V$ and $\psi_2 \in V^\perp$.

2. If $V$ is any subspace of $\mathbf{H}$, then $(V^\perp)^\perp = \overline{V}$, where $\overline{V}$ is the closure
of $V$. In particular, if $V$ is closed, then $(V^\perp)^\perp = V$.

If $V$ is closed, we call $V^\perp$ the orthogonal complement of $V$.
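The decomposition in Point 1 of Proposition A.49 can be sketched in finite dimensions, where every subspace is closed. The example below assumes $\mathbf{H} = \mathbb{C}^3$ with the standard inner product and takes $V$ to be the span of two linearly independent vectors; the function names `gram_schmidt` and `decompose` are hypothetical helpers introduced only for this illustration.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def gram_schmidt(vectors):
    """Return an orthonormal list with the same span (assumes independence)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(e, w)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        length = abs(inner(w, w)) ** 0.5
        basis.append([wi / length for wi in w])
    return basis

def decompose(psi, spanning_vectors):
    """Split psi as psi1 + psi2 with psi1 in V and psi2 in V-perp."""
    psi1 = [0j] * len(psi)
    for e in gram_schmidt(spanning_vectors):
        coeff = inner(e, psi)
        psi1 = [p + coeff * ei for p, ei in zip(psi1, e)]
    psi2 = [a - b for a, b in zip(psi, psi1)]
    return psi1, psi2

# V = span{(1, i, 0), (0, 1, 1)} inside C^3 (sample data for illustration).
V = [[1, 1j, 0], [0, 1, 1]]
psi = [2 + 1j, -1, 3j]
psi1, psi2 = decompose(psi, V)

# psi2 is orthogonal to every spanning vector of V, hence to all of V.
assert all(abs(inner(v, psi2)) < 1e-12 for v in V)
# The two pieces add back up to psi.
assert all(abs(a + b - c) < 1e-12 for a, b, c in zip(psi1, psi2, psi))
```

In infinite dimensions the closedness of $V$ is what guarantees that the component $\psi_1$ exists; the finite-dimensional sketch cannot exhibit that subtlety.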


Definition A.50 A set $\{e_j\}$ of elements of $\mathbf{H}$, where $j$ ranges over an
arbitrary index set, is said to be orthonormal if

\[ \langle e_j, e_k \rangle = \begin{cases} 0, & j \neq k \\ 1, & j = k. \end{cases} \]

An orthonormal set $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$ if the space of
finite linear combinations of the $e_j$'s is dense in $\mathbf{H}$.


If $\mathbf{H} = L^2([-L, L])$, for some positive number $L$, then the functions

\[ \frac{1}{\sqrt{2L}} e^{\pi i n x / L}, \quad n \in \mathbb{Z}, \tag{A.11} \]

form an orthonormal basis for $\mathbf{H}$.

Proposition A.51 Suppose $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$. Then every
$\psi$ can be expressed uniquely as a convergent sum

\[ \psi = \sum_j a_j e_j, \tag{A.12} \]

where the coefficients are given by $a_j = \langle e_j, \psi \rangle$. If $\psi$ is as in (A.12), then

\[ \|\psi\|^2 = \sum_j |a_j|^2. \]

Finally, if $(a_j)$ is any sequence such that $\sum_j |a_j|^2 < \infty$, there exists a
unique $\psi \in \mathbf{H}$ such that $\langle e_j, \psi \rangle = a_j$ for all $j$.

In the case that the orthonormal basis is the one in (A.11), the resulting
series (A.12) is called the Fourier series of $\psi$.
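The norm identity in Proposition A.51 can be illustrated numerically for a Fourier series. The sketch below assumes the basis functions $e_n(x) = e^{\pi i n x / L} / \sqrt{2L}$ on $[-L, L]$ (each of period $2L$), approximates the inner product (A.10) by a midpoint rule, and takes $\psi(x) = x$ as a sample function. The values of `L`, `M`, and `N` and all helper names are arbitrary choices made for this illustration.

```python
import cmath
import math

L = 1.5      # half-width of the interval [-L, L] (arbitrary choice)
M = 2000     # number of midpoint-rule sample points
N = 50       # keep the frequencies n = -N, ..., N
h = 2 * L / M
xs = [-L + (k + 0.5) * h for k in range(M)]

def e(n, x):
    """Assumed basis function exp(pi*i*n*x/L) / sqrt(2L), of period 2L."""
    return cmath.exp(1j * math.pi * n * x / L) / math.sqrt(2 * L)

def inner(f, g):
    """Midpoint-rule approximation of the L^2([-L, L]) inner product."""
    return sum(f(x).conjugate() * g(x) for x in xs) * h

def f(x):
    """Sample function psi(x) = x."""
    return complex(x)

# Fourier coefficients a_n = <e_n, f>, as in Proposition A.51.
coeffs = {n: inner(lambda x, n=n: e(n, x), f) for n in range(-N, N + 1)}

partial_sum = sum(abs(a) ** 2 for a in coeffs.values())
norm_sq = inner(f, f).real       # close to the exact value 2 L^3 / 3
ratio = partial_sum / norm_sq    # approaches 1 as N grows
print(ratio)
```

With only finitely many coefficients the sum of $|a_n|^2$ falls slightly short of $\|\psi\|^2$; the shortfall shrinks as `N` increases, consistent with the convergence asserted in the proposition.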


A.4.3 The Riesz Theorem and Adjoints

We let $\mathcal{B}(\mathbf{H})$ denote the space of bounded linear maps of $\mathbf{H}$ to $\mathbf{H}$. It is not
hard to show that $\mathcal{B}(\mathbf{H})$ forms a Banach space under the operator norm.

Theorem A.52 (Riesz Theorem) If $\xi : \mathbf{H} \to \mathbb{C}$ is a bounded linear
functional, then there exists a unique $\chi \in \mathbf{H}$ such that

\[ \xi(\psi) = \langle \chi, \psi \rangle \]

for all $\psi \in \mathbf{H}$. Furthermore, the operator norm of $\xi$ as a linear functional
is equal to the norm of $\chi$ as an element of $\mathbf{H}$.

We  now  turn  to  the  concept  of  the  adjoint  of  a  bounded  operator,  along 
with  the  related  concept  of  quadratic  forms  on  H. 

Proposition A.53 For any $A \in \mathcal{B}(\mathbf{H})$, there exists a unique linear operator
$A^* : \mathbf{H} \to \mathbf{H}$, called the adjoint of $A$, such that

\[ \langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle \]

for all $\phi, \psi \in \mathbf{H}$. For all $A, B \in \mathcal{B}(\mathbf{H})$ and $\alpha, \beta \in \mathbb{C}$ we have

\[ \begin{aligned}
(A^*)^* &= A \\
(AB)^* &= B^* A^* \\
(\alpha A + \beta B)^* &= \bar{\alpha} A^* + \bar{\beta} B^* \\
I^* &= I.
\end{aligned} \]

The operator $A^*$ is bounded and $\|A^*\| = \|A\|$.

Since $A$ is a bounded operator, the map $\psi \mapsto \langle \phi, A\psi \rangle$ is a bounded linear
functional for each fixed $\phi \in \mathbf{H}$. The Riesz theorem then tells us that there
is a unique $\chi \in \mathbf{H}$ such that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$. The operator $A^*$ is defined
by setting $A^*\phi = \chi$. It is not hard to check that this definition makes $A^*$
into a bounded linear operator.
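In the finite-dimensional case $\mathbf{H} = \mathbb{C}^n$ with the standard inner product, the adjoint is the conjugate transpose of the matrix. The following sketch, with matrices stored as nested lists and with hypothetical helper names, checks the defining relation and two of the algebraic identities in Proposition A.53 on sample data.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose, which represents A* on C^n."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Sample matrices and vectors (arbitrary values for illustration).
A = [[1 + 1j, 2], [0, 3 - 1j]]
B = [[0, 1j], [1, 1]]
phi = [1, 1j]
psi = [2 - 1j, 1]

# Defining relation <phi, A psi> = <A* phi, psi>.
assert abs(inner(phi, apply(A, psi)) - inner(apply(adjoint(A), phi), psi)) < 1e-12
# (A*)* = A and (AB)* = B* A*.
assert max_diff(adjoint(adjoint(A)), A) < 1e-12
assert max_diff(adjoint(matmul(A, B)), matmul(adjoint(B), adjoint(A))) < 1e-12
```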

Definition A.54 An operator $A \in \mathcal{B}(\mathbf{H})$ is said to be self-adjoint if
$A^* = A$ and skew-self-adjoint if $A^* = -A$.

Definition A.55 An operator $U$ on $\mathbf{H}$ is unitary if $U$ is surjective and
preserves inner products, that is, $\langle U\phi, U\psi \rangle = \langle \phi, \psi \rangle$ for all $\phi, \psi \in \mathbf{H}$.

If $U$ is unitary, then $U$ preserves norms ($\|U\psi\| = \|\psi\|$ for all $\psi \in \mathbf{H}$);
therefore, $U$ is bounded with $\|U\| = 1$. By the polarization identity
(Proposition A.59), if $U$ preserves norms, then it also preserves inner products.

Proposition A.56 A bounded operator $U$ is unitary if and only if $U^* = U^{-1}$,
that is, if and only if $UU^* = U^*U = I$.

Proposition A.57 For any closed subspace $V \subset \mathbf{H}$, there is a unique
bounded operator $P$ such that $P = I$ on $V$ and $P = 0$ on the orthogonal
complement $V^\perp$. This operator is called the orthogonal projection onto
$V$ and it satisfies $P^2 = P$ and $P^* = P$.

Conversely, if $P$ is any bounded operator on $\mathbf{H}$ satisfying $P^2 = P$ and
$P^* = P$, then $P$ is the orthogonal projection onto a closed subspace $V$, where
$V = \mathrm{range}(P)$.
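Propositions A.56 and A.57 can be checked on small matrices. The sketch below assumes $\mathbf{H} = \mathbb{C}^2$, builds one sample unitary `U` and the orthogonal projection `P` onto the line spanned by a sample vector `v`, and verifies $UU^* = U^*U = I$, $P^2 = P$, and $P^* = P$. All names and numerical values are illustrative assumptions.

```python
import math

def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

identity = [[1, 0], [0, 1]]

# A sample unitary: a rotation by theta, with the second row multiplied by i.
theta = 0.7
U = [[math.cos(theta), -math.sin(theta)],
     [1j * math.sin(theta), 1j * math.cos(theta)]]
assert max_diff(matmul(U, adjoint(U)), identity) < 1e-12
assert max_diff(matmul(adjoint(U), U), identity) < 1e-12

# Orthogonal projection onto V = span{v}: P = v v* / <v, v>.
v = [1, 1j]
P = [[v[j] * v[k].conjugate() / inner(v, v).real for k in range(2)]
     for j in range(2)]
assert max_diff(matmul(P, P), P) < 1e-12      # P^2 = P
assert max_diff(adjoint(P), P) < 1e-12        # P* = P

# P is the identity on V and zero on the orthogonal complement.
w = [1, -1j]                                   # orthogonal to v
assert abs(inner(v, w)) < 1e-12
assert all(abs(a - b) < 1e-12 for a, b in zip(apply(P, v), v))
assert all(abs(a) < 1e-12 for a in apply(P, w))
```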


A.4.4 Quadratic Forms

In  this  section,  we  develop  the  theory  of  quadratic  forms  on  Hilbert  spaces. 
Since  this  is  customarily  done  only  for  the  inner  product  itself,  we  include 
the  proofs  of  the  results. 


Definition A.58 A sesquilinear form on $\mathbf{H}$ is a map $L : \mathbf{H} \times \mathbf{H} \to \mathbb{C}$
that is conjugate linear in the first factor and linear in the second factor.
A sesquilinear form is bounded if there exists a constant $C$ such that

\[ |L(\phi, \psi)| \leq C \|\phi\| \|\psi\| \]

for all $\phi, \psi \in \mathbf{H}$.

Proposition A.59 If $L$ is a sesquilinear form on $\mathbf{H}$, $L$ can be recovered
from its values on the diagonal (i.e., the value of $L(\psi, \psi)$ for various $\psi$'s)
as follows:

\[ \begin{aligned}
L(\phi, \psi) &= \frac{1}{2} \left[ L(\phi + \psi, \phi + \psi) - L(\phi, \phi) - L(\psi, \psi) \right] \\
&\quad - \frac{i}{2} \left[ L(\phi + i\psi, \phi + i\psi) - L(\phi, \phi) - L(i\psi, i\psi) \right].
\end{aligned} \tag{A.13} \]

This formula is known as the polarization identity.

Note that we do not assume any relationship between $L(\phi, \psi)$ and $L(\psi, \phi)$.
Proof. Direct calculation. ■
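The "direct calculation" can be spot-checked numerically. The sketch below assumes $\mathbf{H} = \mathbb{C}^2$ and the sesquilinear form $L(\phi, \psi) = \langle \phi, A\psi \rangle$ for a deliberately non-self-adjoint sample matrix `A`, so that $L(\phi, \psi)$ and $L(\psi, \phi)$ are unrelated. It then rebuilds $L(\phi, \psi)$ from diagonal values only, following (A.13). The names `L`, `diag`, and `polarization` are hypothetical.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

# Sample matrix, chosen so that A is NOT equal to its conjugate transpose.
A = [[1 + 2j, -1j], [3, 0.5 - 1j]]

def L(phi, psi):
    """Sesquilinear form L(phi, psi) = <phi, A psi>."""
    return inner(phi, apply(A, psi))

def diag(chi):
    """Value of L on the diagonal."""
    return L(chi, chi)

def polarization(phi, psi):
    """Right-hand side of (A.13), using only diagonal values of L."""
    i_psi = [1j * x for x in psi]
    plus = [a + b for a, b in zip(phi, psi)]
    plus_i = [a + b for a, b in zip(phi, i_psi)]
    first = diag(plus) - diag(phi) - diag(psi)
    second = diag(plus_i) - diag(phi) - diag(i_psi)
    return 0.5 * first - 0.5j * second

phi = [1 - 1j, 2]
psi = [0.5j, -1 + 3j]
assert abs(polarization(phi, psi) - L(phi, psi)) < 1e-12
# For this A, L is not conjugate symmetric, yet the identity still holds.
assert abs(L(phi, psi) - L(psi, phi).conjugate()) > 1e-6
```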


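Definition A.60 A quadratic form on a Hilbert space $\mathbf{H}$ is a map
$Q : \mathbf{H} \to \mathbb{C}$ with the following properties: (1) $Q(\lambda\psi) = |\lambda|^2 Q(\psi)$ for all $\psi \in \mathbf{H}$
and $\lambda \in \mathbb{C}$, and (2) the map $L : \mathbf{H} \times \mathbf{H} \to \mathbb{C}$ defined by

\[ \begin{aligned}
L(\phi, \psi) &= \frac{1}{2} \left[ Q(\phi + \psi) - Q(\phi) - Q(\psi) \right] \\
&\quad - \frac{i}{2} \left[ Q(\phi + i\psi) - Q(\phi) - Q(i\psi) \right]
\end{aligned} \]

is a sesquilinear form. A quadratic form $Q$ is bounded if there exists a
constant $C$ such that

\[ |Q(\phi)| \leq C \|\phi\|^2 \]

for all $\phi \in \mathbf{H}$. The smallest such constant $C$ is the norm of $Q$.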


Proposition A.61 If $Q$ is a quadratic form on $\mathbf{H}$ and $L$ is the associated
sesquilinear form, we have the following results.

1. For all $\psi \in \mathbf{H}$, we have $Q(\psi) = L(\psi, \psi)$.

2. If $Q$ is bounded, then $L$ is bounded.

3. If $Q(\psi)$ belongs to $\mathbb{R}$ for all $\psi \in \mathbf{H}$, then $L$ is conjugate symmetric,
that is,

\[ L(\phi, \psi) = \overline{L(\psi, \phi)} \]

for all $\phi, \psi \in \mathbf{H}$.


Proof. Point 1 of the proposition is verified by taking $\phi = \psi$ in the expression
for $L$ and then using the relation $Q(\lambda\psi) = |\lambda|^2 Q(\psi)$. For Point
2, suppose $|Q(\psi)| \leq C \|\psi\|^2$ for all $\psi \in \mathbf{H}$. If $\|\phi\| = \|\psi\| = 1$, then $\phi + \psi$
and $\phi + i\psi$ have norm at most 2, and so

\[ |L(\phi, \psi)| \leq \frac{1}{2} C (4 + 1 + 1 + 4 + 1 + 1) = 6C. \]

Now, for any $\phi$ and $\psi$ in $\mathbf{H}$, we can find unit vectors $\tilde{\phi}$ and $\tilde{\psi}$ such that
$\phi = \|\phi\| \tilde{\phi}$ and $\psi = \|\psi\| \tilde{\psi}$. Then since $L$ is assumed to be sesquilinear, we
have

\[ |L(\phi, \psi)| = \|\phi\| \|\psi\| \, |L(\tilde{\phi}, \tilde{\psi})| \leq 6C \|\phi\| \|\psi\|, \]

showing that $L$ is bounded.

For Point 3, assume that $Q(\psi)$ is real for all $\psi \in \mathbf{H}$ and define a map
$M : \mathbf{H} \times \mathbf{H} \to \mathbb{R}$ by

\[ M(\phi, \psi) = \frac{1}{2} \left[ Q(\phi + \psi) - Q(\phi) - Q(\psi) \right] = \mathrm{Re} \left[ L(\phi, \psi) \right]. \]

Then $M$ is real-bilinear (because it is the real part of $L$) and symmetric
(because of the expression for $M$ in terms of $Q$). Furthermore, $M(i\phi, i\psi) =
M(\phi, \psi)$. These properties of $M$ show that $M(\phi, i\psi) = -M(\psi, i\phi)$, and so

\[ \begin{aligned}
L(\phi, \psi) &= M(\phi, \psi) - iM(\phi, i\psi) \\
&= M(\psi, \phi) + iM(\psi, i\phi) \\
&= \overline{L(\psi, \phi)},
\end{aligned} \]

which is what we wanted to prove. ■

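Example A.62 If $A$ is a bounded operator on $\mathbf{H}$, one can construct a
bounded quadratic form $Q_A$ on $\mathbf{H}$ by setting

\[ Q_A(\psi) = \langle \psi, A\psi \rangle, \quad \psi \in \mathbf{H}. \]

The associated sesquilinear form $L_A$ is then given by

\[ L_A(\phi, \psi) = \langle \phi, A\psi \rangle, \quad \phi, \psi \in \mathbf{H}. \]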

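Proposition A.63 If $Q$ is a bounded quadratic form on $\mathbf{H}$, there is a
unique $A \in \mathcal{B}(\mathbf{H})$ such that $Q(\psi) = \langle \psi, A\psi \rangle$ for all $\psi \in \mathbf{H}$. If $Q(\psi)$ belongs
to $\mathbb{R}$ for all $\psi \in \mathbf{H}$, then the operator $A$ is self-adjoint.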


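Proof. Since $Q$ is bounded, $L$ is also bounded, meaning that there exists
a constant $C$ such that $|L(\phi, \psi)| \leq C \|\phi\| \|\psi\|$ for all $\phi, \psi \in \mathbf{H}$. Thus, for
any $\phi \in \mathbf{H}$, the linear functional $\psi \mapsto L(\phi, \psi)$ is bounded, with norm at
most $C \|\phi\|$. By the Riesz theorem, then, there exists a unique $\chi \in \mathbf{H}$,
with $\|\chi\| \leq C \|\phi\|$, such that $L(\phi, \psi) = \langle \chi, \psi \rangle$. We now define a map
$B : \mathbf{H} \to \mathbf{H}$ by defining $B\phi = \chi$. Direct calculation shows that $B$ is linear,
and the inequality $\|\chi\| \leq C \|\phi\|$ shows that $B$ is bounded. Setting $A = B^*$
gives $L(\phi, \psi) = \langle B\phi, \psi \rangle = \langle \phi, A\psi \rangle$, and taking $\phi = \psi$
establishes the existence of the desired operator. Uniqueness of $A$ follows
from the observation that $Q$ determines $\langle \phi, A\psi \rangle$ for all $\phi, \psi$ (by polarization),
and that if $\langle \phi, A\psi \rangle = 0$ for all $\phi, \psi \in \mathbf{H}$, then $A$ is the
zero operator.

If $Q(\psi)$ is real for all $\psi \in \mathbf{H}$, then by Point 3 of Proposition A.61, $L$ is
conjugate symmetric. Thus,

\[ \langle \phi, A\psi \rangle = L(\phi, \psi) = \overline{L(\psi, \phi)} = \overline{\langle \psi, A\phi \rangle} = \langle A\phi, \psi \rangle \]

for all $\phi, \psi \in \mathbf{H}$, showing that $A$ is self-adjoint. ■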
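In finite dimensions, Proposition A.63 becomes a recipe: the matrix entries of $A$ are the values $L(e_j, e_k)$ of the associated sesquilinear form on a basis, and these are computed from $Q$ alone by the formula in Definition A.60. The sketch below assumes $\mathbf{H} = \mathbb{C}^2$ and two hypothetical sample forms, `Q` (complex-valued) and `Q_real` (real-valued); the helper names are illustrative.

```python
def associated_form(Q, phi, psi):
    """The sesquilinear form L built from Q as in Definition A.60."""
    i_psi = [1j * x for x in psi]
    plus = [a + b for a, b in zip(phi, psi)]
    plus_i = [a + b for a, b in zip(phi, i_psi)]
    return (0.5 * (Q(plus) - Q(phi) - Q(psi))
            - 0.5j * (Q(plus_i) - Q(phi) - Q(i_psi)))

def operator_from_form(Q, n):
    """Matrix with entries A[j][k] = L(e_j, e_k), so Q(psi) = <psi, A psi>."""
    basis = [[1 if j == k else 0 for k in range(n)] for j in range(n)]
    return [[associated_form(Q, basis[j], basis[k]) for k in range(n)]
            for j in range(n)]

def Q(psi):
    """Sample complex-valued quadratic form on C^2 (hypothetical)."""
    z, w = psi
    return 2 * abs(z) ** 2 + (1 + 1j) * z.conjugate() * w + 3j * abs(w) ** 2

def Q_real(psi):
    """Sample real-valued quadratic form on C^2 (hypothetical)."""
    z, w = psi
    return (2 * abs(z) ** 2 + 2 * ((1 + 1j) * z.conjugate() * w).real
            + 3 * abs(w) ** 2)

A = operator_from_form(Q, 2)            # expected [[2, 1+1j], [0, 3j]]
A_real = operator_from_form(Q_real, 2)  # expected [[2, 1+1j], [1-1j, 3]]

# A real-valued form yields a self-adjoint (Hermitian) matrix.
assert all(abs(A_real[j][k] - A_real[k][j].conjugate()) < 1e-12
           for j in range(2) for k in range(2))
```

The first form is not real-valued, and correspondingly the recovered matrix is not self-adjoint.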


A.4.5 Tensor Products of Hilbert Spaces

Recall from Appendix A.1 the concept of the tensor product of two vector
spaces.

Proposition A.64 Suppose $V_1$ and $V_2$ are inner product spaces, with inner
products $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$. Then there exists a unique inner product $\langle \cdot, \cdot \rangle$ on
$V_1 \otimes V_2$ such that

\[ \langle u_1 \otimes v_1, u_2 \otimes v_2 \rangle = \langle u_1, u_2 \rangle_1 \langle v_1, v_2 \rangle_2 \]

for all $u_1, u_2 \in V_1$ and $v_1, v_2 \in V_2$.

If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces, then we can equip the tensor product
$\mathbf{H}_1 \otimes \mathbf{H}_2$ with the inner product in Proposition A.64. If $\mathbf{H}_1$ and $\mathbf{H}_2$ are both
infinite dimensional, however, $\mathbf{H}_1 \otimes \mathbf{H}_2$ will not be complete with respect
to this inner product. Nevertheless, we can complete $\mathbf{H}_1 \otimes \mathbf{H}_2$ with respect
to this inner product, thus obtaining a new Hilbert space.

Definition A.65 If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces, then the Hilbert tensor
product of $\mathbf{H}_1$ and $\mathbf{H}_2$, denoted $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$, is the Hilbert space obtained
by completing $\mathbf{H}_1 \otimes \mathbf{H}_2$ with respect to the inner product in Proposition
A.64.

Proposition A.66 If $\mathbf{H}_1$ and $\mathbf{H}_2$ are Hilbert spaces with orthonormal
bases $\{e_j\}$ and $\{f_k\}$, respectively, then $\{e_j \otimes f_k\}$ is an orthonormal basis
for the Hilbert space $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$.

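Proposition A.67 If $A$ is a bounded operator on $\mathbf{H}_1$ and $B$ is a bounded
operator on $\mathbf{H}_2$, then there exists a unique bounded operator on $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$,
denoted $A \otimes B$, such that

\[ (A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi) \]

for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$.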
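For finite-dimensional spaces the tensor product can be modeled by the Kronecker product, and no completion is needed. The sketch below assumes $\mathbf{H}_1 = \mathbf{H}_2 = \mathbb{C}^2$, identifies $\phi \otimes \psi$ with the vector of all products of components, and checks the inner product formula of Proposition A.64 and the defining property of $A \otimes B$ in Proposition A.67 on sample data. The helper names `kron_vec` and `kron_mat` are hypothetical.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def kron_vec(phi, psi):
    """Model of phi (x) psi: all products phi[j] * psi[k], with j the slow index."""
    return [a * b for a in phi for b in psi]

def kron_mat(A, B):
    """Model of A (x) B, using the same index ordering as kron_vec."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Sample data (arbitrary values for illustration).
A = [[1, 2j], [0, 1 - 1j]]
B = [[0, 1], [1j, 3]]
u1, u2 = [1, 1j], [2, -1]
v1, v2 = [1 + 1j, 0], [3, 1j]

# Proposition A.64: <u1 (x) v1, u2 (x) v2> = <u1, u2> <v1, v2>.
lhs = inner(kron_vec(u1, v1), kron_vec(u2, v2))
assert abs(lhs - inner(u1, u2) * inner(v1, v2)) < 1e-12

# Proposition A.67: (A (x) B)(phi (x) psi) = (A phi) (x) (B psi).
left = apply(kron_mat(A, B), kron_vec(u1, v2))
right = kron_vec(apply(A, u1), apply(B, v2))
assert all(abs(a - b) < 1e-12 for a, b in zip(left, right))
```

Taking norms in the second identity gives $\|(A \otimes B)(\phi \otimes \psi)\| = \|A\phi\| \, \|B\psi\|$, which is the observation behind the lower bound on $\|A \otimes B\|$.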

To see that $A \otimes B$ is bounded, first write $A \otimes B$ as $(A \otimes I)(I \otimes B)$. Then,
given any orthonormal basis $\{f_j\}$ for $\mathbf{H}_2$, we can decompose $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ as the
Hilbert space direct sum of subspaces of the form $\mathbf{H}_1 \otimes f_j$. The operator
$A \otimes I$ acts on this decomposition as a block-diagonal operator with $A$ in
each diagonal block. From this, it is easy to verify that $\|A \otimes I\| = \|A\|$. A
similar argument shows that $\|I \otimes B\| = \|B\|$, and so

\[ \|A \otimes B\| \leq \|A \otimes I\| \, \|I \otimes B\| = \|A\| \, \|B\|. \]

Meanwhile, by taking sequences of unit vectors $\phi_n \in \mathbf{H}_1$ and $\psi_n \in \mathbf{H}_2$
with $\|A\phi_n\| \to \|A\|$ and $\|B\psi_n\| \to \|B\|$, we see that the reverse inequality
holds, and thus that $\|A \otimes B\| = \|A\| \, \|B\|$.